Jupyter Notebook and File Handling
Our use of Jupyter Notebooks
RapidFire AI has built-in support for Jupyter notebooks, one of the most popular IDEs for AI use case development. Invoking the functions of our API from a Jupyter notebook gives DL users a laptop-like experience, while still leveraging the full power of our scalable distributed engine on a cluster.
The Jupyter server in the Trial version of our product comes with the example tutorial use case notebooks in its home directory. You can edit those to add more operations as you wish. You can also duplicate those or upload new notebooks and edit those to invoke RapidFire AI’s API.
An important caveat is that you must run only one notebook at a time on the server; we do not currently support concurrent notebook execution. Please also run cells synchronously, i.e., run a cell only after the previous cell has fully finished. Asynchronous or concurrent cell execution can lead to non-deterministic system behavior.
Notebooks in RapidFire AI can also run cells with regular Python code, just like on your laptop. So, you can manipulate regular Python data structures (e.g., DataFrames), plot using matplotlib, etc. as you like in between calls to our API.
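For instance, a notebook cell might inspect per-epoch metrics with ordinary pandas in between API calls. This is a minimal sketch: the metric values are made up for illustration, and only the pandas part is shown (the surrounding RapidFire AI calls are omitted).

```python
import pandas as pd

# Hypothetical per-epoch metrics you might collect in between API calls.
metrics = pd.DataFrame({
    "epoch": [1, 2, 3],
    "val_loss": [0.92, 0.71, 0.64],
})

# Ordinary pandas manipulation, exactly as on a laptop:
# pick the row with the lowest validation loss.
best = metrics.loc[metrics["val_loss"].idxmin()]
print(int(best["epoch"]))  # epoch with the lowest validation loss
```

You could just as easily call `metrics.plot(x="epoch", y="val_loss")` to chart the curve with matplotlib inside the notebook.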
That said, our API does not depend on Jupyter, nor is Jupyter the only way to use it. You can also submit your code and experiments via a single command line script or via automated workflow tools such as Airflow, without a human in the loop. Contact us if you are interested in these mechanisms; we will add more information on them here in the near future.
About requirements.txt and Custom Code Files
In the Jupyter home directory, you will see a requirements.txt file. This is the standard way to specify Python dependencies that must be installed. RapidFire AI will automatically install these dependencies on all machines in the cluster whenever you invoke an Experiment operation such as run_fit(). RapidFire AI already ships with popular dependencies such as torch, torchvision, HF transformers, and pandas. So, make sure to list in your requirements.txt any use case-specific new packages you want installed.
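For example, a requirements.txt listing two use case-specific additions might look like the fragment below (the package choices are purely illustrative; torch, torchvision, transformers, and pandas need not be listed since they already ship with RapidFire AI):

```
sentencepiece==0.2.0
scikit-learn>=1.4
```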
You can also define any custom Python classes in a .py code file and put it in the Jupyter home directory. RapidFire AI will automatically copy those files to all machines in the cluster. This lets you import your custom classes as a package in your MLSpec code.
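For example, you might place a file like the following in the Jupyter home directory. The file name (my_transforms.py) and the class are hypothetical, made up here to illustrate the pattern:

```python
# my_transforms.py -- a hypothetical custom code file placed in the Jupyter
# home directory; RapidFire AI copies it to all machines in the cluster.

class TokenTruncator:
    """Illustrative custom class: truncate a token list to a maximum length."""

    def __init__(self, max_len: int):
        self.max_len = max_len

    def __call__(self, tokens):
        # Keep only the first max_len tokens.
        return tokens[: self.max_len]
```

Your MLSpec code could then import it as usual, e.g. `from my_transforms import TokenTruncator`.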