Commit 6fded4e

added info on check_usage

paciorek committed Sep 21, 2018
1 parent: 171e99a

Showing 4 changed files with 38 additions and 6 deletions.
1 change: 1 addition & 0 deletions ideasForNextTime
@@ -0,0 +1 @@
A participant suggested we warn folks in advance that a basic understanding of the shell is useful and point them to training materials.
15 changes: 13 additions & 2 deletions intro.html
@@ -54,7 +54,7 @@ <h3 class="date">Kunal Mishra and Chris Paciorek</h3>
<h1 id="introduction">Introduction</h1>
<p>We'll do this mostly as a demonstration. We encourage you to log in to your account and try out the various examples yourself as we go through them.</p>
<p>Much of this material is based on the extensive Savio documentation we have prepared and continue to prepare, available at <a href="http://research-it.berkeley.edu/services/high-performance-computing" class="uri">http://research-it.berkeley.edu/services/high-performance-computing</a>.</p>
<p>The materials for this tutorial are available using git at <a href="https://github.com/ucberkeley/savio-training-intro-2018" class="uri">https://github.com/ucberkeley/savio-training-intro-2018</a> or simply as a <a href="https://github.com/ucberkeley/savio-training-intro-2018/archive/master.zip">zip file</a>.</p>
<p>The materials for this tutorial are available using git at the short URL <a href="http://bit.do/F18Savio" class="uri">http://bit.do/F18Savio</a>, the GitHub URL <a href="https://github.com/ucberkeley/savio-training-intro-2018" class="uri">https://github.com/ucberkeley/savio-training-intro-2018</a>, or simply as a <a href="https://github.com/ucberkeley/savio-training-intro-2018/archive/master.zip">zip file</a>.</p>
<h1 id="outline">Outline</h1>
<p>This training session will cover the following topics:</p>
<ul>
@@ -349,7 +349,7 @@ <h1 id="alternatives-to-the-htc-partition-for-collections-of-serial-jobs">Alternatives to the HTC partition for collections of serial jobs</h1>
<li>parallel functionality in MATLAB through <em>parfor</em></li>
</ul></li>
</ul>
<h1 id="monitoring-jobs-and-the-job-queue">Monitoring jobs and the job queue</h1>
<h1 id="monitoring-jobs-the-job-queue-and-overall-usage">Monitoring jobs, the job queue, and overall usage</h1>
<p>The basic command for seeing what is running on the system is <code>squeue</code>:</p>
<pre><code>squeue
squeue -u SAVIO_USERNAME
@@ -362,6 +362,8 @@ <h1 id="monitoring-jobs-and-the-job-queue">Monitoring jobs and the job queue</h1>
<p>For more information on cores, QoS, and additional (e.g., GPU) resources, here's some syntax:</p>
<pre><code>squeue -o &quot;%.7i %.12P %.20j %.8u %.2t %.9M %.5C %.8r %.3D %.20R %.8p %.20q %b&quot; </code></pre>
<p>We provide some <a href="http://research-it.berkeley.edu/services/high-performance-computing/running-your-jobs">tips about monitoring your jobs</a>. (Scroll down to the &quot;Monitoring jobs&quot; section.)</p>
<p>If you'd like to see how much of an FCA (Faculty Computing Allowance) has been used:</p>
<pre><code>check_usage.sh -a fc_popgen </code></pre>
<h1 id="example-use-of-standard-software-ipython-and-r-notebooks-through-jupyterhub">Example use of standard software: IPython and R notebooks through JupyterHub</h1>
<p>Savio allows one to <a href="http://research-it.berkeley.edu/services/high-performance-computing/using-jupyter-notebooks-and-jupyterhub-savio">run Jupyter-based notebooks via a browser-based service called JupyterHub</a>.</p>
<p>Let's see a brief demo of an IPython notebook:</p>
@@ -502,6 +504,15 @@ <h1 id="example-use-of-standard-software-r">Example use of standard software: R</h1>
}

results</code></pre>
<h1 id="alternative-python-parallelization-dask">Alternative Python Parallelization: Dask</h1>
<p>In addition to ipyparallel, one of the newer tools in the Python space is <a href="http://dask.pydata.org/en/latest/">Dask</a>, which provides out-of-the-box parallelization with little setup or additional code. As a Python package, Dask extends the familiar NumPy/pandas syntax for arrays and dataframes and introduces native parallelization to these data structures, which speeds up analyses. Since Dask dataframes and arrays mirror the pandas dataframe and NumPy array, they are largely compatible with existing code and can serve as drop-in replacements, with performance enhancements across multiple cores or nodes. Dask is useful for scaling up to large clusters like Savio, but it can also speed up analyses on your local computer. We're including some articles and documentation that may be helpful in getting started (see the short sketch after this list):</p>
<ul>
<li><a href="https://dask.pydata.org/en/latest/why.html">Why Dask?</a></li>
<li><a href="https://www.youtube.com/watch?v=ods97a5Pzw0">Standard Dask Demo</a></li>
<li><a href="http://dask.pydata.org/en/latest/api.html">Why every Data Scientist should use Dask</a></li>
<li><a href="https://dask.pydata.org/en/latest/_downloads/daskcheatsheet.pdf">Dask Cheatsheet</a></li>
<li><a href="https://www.youtube.com/watch?v=mjQ7tCQxYFQ">Detailed Dask overview video</a></li>
</ul>
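<p>Here's a minimal sketch of that drop-in flavor (our example, not from the Dask docs; the file and column names are hypothetical, and without further configuration Dask runs on local threads rather than across nodes):</p>
<pre><code>import dask.dataframe as dd

# Lazily read a (hypothetical) large CSV; this returns a chunked
# Dask dataframe, not an in-memory pandas dataframe.
df = dd.read_csv('measurements.csv')

# Familiar pandas-style syntax; no work happens until .compute().
result = df.groupby('site')['value'].mean().compute()
print(result)</code></pre>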
<h1 id="how-to-get-additional-help">How to get additional help</h1>
<ul>
<li>For technical issues and questions about using Savio:
9 changes: 8 additions & 1 deletion intro.md
@@ -442,7 +442,7 @@ Here are some options:
- parallel Python tools such as *ipyparallel* and *Dask*
- parallel functionality in MATLAB through *parfor*

# Monitoring jobs and the job queue
# Monitoring jobs, the job queue, and overall usage

The basic command for seeing what is running on the system is `squeue`:
```
@@ -469,6 +469,13 @@ squeue -o "%.7i %.12P %.20j %.8u %.2t %.9M %.5C %.8r %.3D %.20R %.8p %.20q %b"

We provide some [tips about monitoring your jobs](http://research-it.berkeley.edu/services/high-performance-computing/running-your-jobs). (Scroll down to the "Monitoring jobs" section.)
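
As a quick reference, a few standard SLURM commands are also handy here (these are stock SLURM commands rather than Savio-specific tools; `JOBID` is a placeholder):

```
# Estimated start times for your pending jobs
squeue -u $USER --start

# Accounting details (state, runtime, memory) for a finished job
sacct -j JOBID --format=JobID,JobName,State,Elapsed,MaxRSS

# Summary of partitions and node availability
sinfo
```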

If you'd like to see how much of an FCA (Faculty Computing Allowance) has been used:

```
check_usage.sh -a fc_popgen
```


# Example use of standard software: IPython and R notebooks through JupyterHub

Savio allows one to [run Jupyter-based notebooks via a browser-based service called JupyterHub](http://research-it.berkeley.edu/services/high-performance-computing/using-jupyter-notebooks-and-jupyterhub-savio).
19 changes: 16 additions & 3 deletions intro_slides.html
@@ -63,7 +63,7 @@ <h1 class="title">Savio introductory training: Basic usage of the Berkeley Savio
<h1>Introduction</h1>
<p>We'll do this mostly as a demonstration. We encourage you to log in to your account and try out the various examples yourself as we go through them.</p>
<p>Much of this material is based on the extensive Savio documentation we have prepared and continue to prepare, available at <a href="http://research-it.berkeley.edu/services/high-performance-computing" class="uri">http://research-it.berkeley.edu/services/high-performance-computing</a>.</p>
<p>The materials for this tutorial are available using git at <a href="https://github.com/ucberkeley/savio-training-intro-2018" class="uri">https://github.com/ucberkeley/savio-training-intro-2018</a> or simply as a <a href="https://github.com/ucberkeley/savio-training-intro-2018/archive/master.zip">zip file</a>.</p>
<p>The materials for this tutorial are available using git at the short URL <a href="http://bit.do/F18Savio" class="uri">http://bit.do/F18Savio</a>, the GitHub URL <a href="https://github.com/ucberkeley/savio-training-intro-2018" class="uri">https://github.com/ucberkeley/savio-training-intro-2018</a>, or simply as a <a href="https://github.com/ucberkeley/savio-training-intro-2018/archive/master.zip">zip file</a>.</p>
</div>
<div id="outline" class="slide section level1">
<h1>Outline</h1>
@@ -401,8 +401,8 @@ <h1>Alternatives to the HTC partition for collections of serial jobs</h1>
</ul></li>
</ul>
</div>
<div id="monitoring-jobs-and-the-job-queue" class="slide section level1">
<h1>Monitoring jobs and the job queue</h1>
<div id="monitoring-jobs-the-job-queue-and-overall-usage" class="slide section level1">
<h1>Monitoring jobs, the job queue, and overall usage</h1>
<p>The basic command for seeing what is running on the system is <code>squeue</code>:</p>
<pre><code>squeue
squeue -u SAVIO_USERNAME
@@ -415,6 +415,8 @@ <h1>Monitoring jobs and the job queue</h1>
<p>For more information on cores, QoS, and additional (e.g., GPU) resources, here's some syntax:</p>
<pre><code>squeue -o &quot;%.7i %.12P %.20j %.8u %.2t %.9M %.5C %.8r %.3D %.20R %.8p %.20q %b&quot; </code></pre>
<p>We provide some <a href="http://research-it.berkeley.edu/services/high-performance-computing/running-your-jobs">tips about monitoring your jobs</a>. (Scroll down to the &quot;Monitoring jobs&quot; section.)</p>
<p>If you'd like to see how much of an FCA (Faculty Computing Allowance) has been used:</p>
<pre><code>check_usage.sh -a fc_popgen </code></pre>
</div>
<div id="example-use-of-standard-software-ipython-and-r-notebooks-through-jupyterhub" class="slide section level1">
<h1>Example use of standard software: IPython and R notebooks through JupyterHub</h1>
@@ -562,6 +564,17 @@ <h1>Example use of standard software: R</h1>

results</code></pre>
</div>
<div id="alternative-python-parallelization-dask" class="slide section level1">
<h1>Alternative Python Parallelization: Dask</h1>
<p>In addition to ipyparallel, one of the newer tools in the Python space is <a href="http://dask.pydata.org/en/latest/">Dask</a>, which provides out-of-the-box parallelization with little setup or additional code. As a Python package, Dask extends the familiar NumPy/pandas syntax for arrays and dataframes and introduces native parallelization to these data structures, which speeds up analyses. Since Dask dataframes and arrays mirror the pandas dataframe and NumPy array, they are largely compatible with existing code and can serve as drop-in replacements, with performance enhancements across multiple cores or nodes. Dask is useful for scaling up to large clusters like Savio, but it can also speed up analyses on your local computer. We're including some articles and documentation that may be helpful in getting started:</p>
<ul>
<li><a href="https://dask.pydata.org/en/latest/why.html">Why Dask?</a></li>
<li><a href="https://www.youtube.com/watch?v=ods97a5Pzw0">Standard Dask Demo</a></li>
<li><a href="http://dask.pydata.org/en/latest/api.html">Why every Data Scientist should use Dask</a></li>
<li><a href="https://dask.pydata.org/en/latest/_downloads/daskcheatsheet.pdf">Dask Cheatsheet</a></li>
<li><a href="https://www.youtube.com/watch?v=mjQ7tCQxYFQ">Detailed Dask overview video</a></li>
</ul>
</div>
<div id="how-to-get-additional-help" class="slide section level1">
<h1>How to get additional help</h1>
<ul>
