quaintitative

Setting Up a Data Lab Environment - Part 4 - Dockerfile

Sometimes you need more packages than the images you can pull straight from Docker Hub provide.

Instead of just pulling an image, you can roll your own.

It’s pretty straightforward. In the same folder as the docker-compose.yml file, create a new folder ‘docker’, and within it, a ‘jupyter’ folder.
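As a quick sketch of that step from the terminal (assuming your shell is already in the folder that holds docker-compose.yml):

```shell
# Run from the folder that contains docker-compose.yml.
# -p creates both levels in one go and does not complain if they already exist.
mkdir -p docker/jupyter

# List the new folders to confirm the layout.
find docker -type d
```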

Inside the ‘jupyter’ folder, create a file named ‘Dockerfile’. In it, first state the base image you want to build on.

FROM jupyter/tensorflow-notebook 

Set the user to root so that the install commands run with root permissions.

USER root

Next, just prepend RUN to commands that you would usually run in a terminal to install new packages.

RUN pip install --no-cache-dir lxml
RUN conda install --yes --name root scrapy
RUN conda install --yes --name root spacy
RUN conda install --yes --name root gensim
RUN conda install --yes --name root nltk
RUN conda install --yes --name root pymongo
RUN conda install --yes --name root psycopg2
RUN pip install --no-cache-dir tweepy
RUN pip install --no-cache-dir awesome-slugify
RUN pip install --no-cache-dir feedparser
RUN pip install --no-cache-dir jieba
RUN mkdir -p .jupyter
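Putting the steps together, here is one way the whole file could be written from the terminal. Treat it as a sketch rather than the exact file used in this series: it groups the installs into fewer RUN lines (each RUN adds an image layer), and the final USER line is my assumption, based on the Jupyter Docker Stacks convention of exposing the unprivileged notebook user as NB_UID. Note that it overwrites any existing docker/jupyter/Dockerfile.

```shell
# Make sure the build folder exists.
mkdir -p docker/jupyter

# Write the Dockerfile. The quoted 'EOF' stops the shell from expanding
# ${NB_UID}, so the text lands in the file exactly as typed.
cat > docker/jupyter/Dockerfile <<'EOF'
# Base image to build on.
FROM jupyter/tensorflow-notebook

# Install with root permissions.
USER root

# Same packages as above, grouped so each installer runs once.
RUN conda install --yes --name root scrapy spacy gensim nltk pymongo psycopg2
RUN pip install --no-cache-dir lxml tweepy awesome-slugify feedparser jieba
RUN mkdir -p .jupyter

# Assumption: the base image defines NB_UID for the notebook user.
# Switching back means the notebook itself does not run as root.
USER ${NB_UID}
EOF

# Show the result.
cat docker/jupyter/Dockerfile
```

Grouping the conda packages into one command also lets conda resolve their dependencies together instead of once per package.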

Now, go into your docker-compose.yml. Replace image: jupyter/tensorflow-notebook with build: docker/jupyter.
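To make the swap concrete, the sketch below applies it to a stand-in file. The file name docker-compose.example.yml, the service name jupyter, and the port mapping are all assumptions for illustration; your real docker-compose.yml from the earlier parts will have more in it.

```shell
# Hypothetical minimal compose file, written to a separate name so that
# your real docker-compose.yml is left untouched.
cat > docker-compose.example.yml <<'EOF'
version: "3"
services:
  jupyter:
    image: jupyter/tensorflow-notebook
    ports:
      - "8888:8888"
EOF

# Swap the image line for a build line. The .bak suffix keeps a copy
# of the file as it was before the edit.
sed -i.bak 's|image: jupyter/tensorflow-notebook|build: docker/jupyter|' docker-compose.example.yml

# Show the edited file.
cat docker-compose.example.yml
```

The build path is relative to the compose file, which is why the ‘docker’ folder sits next to docker-compose.yml.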

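That’s it. Now when you run docker-compose up -d, the image is built from your Dockerfile and you get a Jupyter environment with the packages you included. If you change the Dockerfile later, run docker-compose up -d --build so that the image is rebuilt.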
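If you expect to tweak the Dockerfile often, a small shell helper can save some typing. This is only a convenience sketch: the function name rebuild_lab is made up, and it assumes you call it from the folder that holds docker-compose.yml.

```shell
# Hypothetical helper: rebuild the image and restart the lab in one go.
rebuild_lab() {
  # --build forces the image to be rebuilt from the Dockerfile
  # before the containers are (re)started in the background.
  docker-compose up -d --build

  # List the containers to confirm they are running.
  docker-compose ps
}
```

Define it in your shell (or your shell profile), then run rebuild_lab after each change to the Dockerfile.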

