tompearson-Defra/init-scripts

 
 

init-scripts

Short shell scripts for initialising data science Linux environments.

Environments

| Environment | Install Guide | Description |
|---|---|---|
| DASH | add repo, add init script, add environment variables; init script: `/Repos/USER/init-scripts/DASH.sh` | DASH global initialisation script for Databricks clusters. † |
| DBFS Clean | schedule for weekends | A schedulable job that removes user-created files outside the lab area. |
| rootcert | add manually | Root certificate to solve an SSH issue. It requires a secret and is therefore not stored on GitHub. |
| SCE | run in terminal: `wget -O- https://raw.githubusercontent.com/Defra-Data-Science-Centre-of-Excellence/init-scripts/main/SCE.sh \| bash` | SCE initialisation script for the SCE virtual Linux machine. |
| SwapFile | run in terminal: `wget -O- https://raw.githubusercontent.com/Defra-Data-Science-Centre-of-Excellence/init-scripts/main/src/swapfile.sh \| bash` | For small SCE machines, reduces Out-of-Memory errors by adding extra swap storage. |
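Piping `wget` straight into `bash` runs the script without any chance to read it first. As a minimal sketch of a more cautious alternative, the helpers below download a script to a temporary file so it can be reviewed before it is run. The function names `raw_url` and `fetch_script` are hypothetical and not part of this repository; only the base URL is taken from the install commands above.

```shell
#!/bin/sh
set -eu

# Base URL copied from the install commands in the table above.
BASE_URL="https://raw.githubusercontent.com/Defra-Data-Science-Centre-of-Excellence/init-scripts/main"

# raw_url PATH (hypothetical helper):
# print the raw download URL for a script path inside the repository.
raw_url() {
  printf '%s/%s\n' "$BASE_URL" "$1"
}

# fetch_script PATH (hypothetical helper):
# download the script to a temporary file and print the file's location.
fetch_script() {
  dest="$(mktemp)"
  wget -q -O "$dest" "$(raw_url "$1")"
  printf '%s\n' "$dest"
}

# Example use (needs network access), left commented out so that
# nothing is downloaded or executed by accident:
#   script="$(fetch_script src/swapfile.sh)"
#   less "$script"    # review the contents first
#   bash "$script"    # then run it
```

The one-line `wget -O- ... | bash` form remains the quickest route on a trusted network; the two-step form simply makes the review step explicit.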

† Sedona requires extra Spark config.
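The README does not say which Spark settings Sedona needs. As a hedged illustration only, the sketch below prints the three settings that Apache Sedona's own Databricks setup guide lists for its 1.x releases that ship `sedona-python-adapter` and `sedona-viz`. The class names are an assumption taken from that guide, not from this repository, and they change between Sedona versions, so check them against the version installed on the cluster. The function name `sedona_spark_conf` is hypothetical.

```shell
#!/bin/sh
set -eu

# sedona_spark_conf (hypothetical helper):
# print Spark config lines in the "key value" form used by the
# Spark config box of a Databricks cluster.
# ASSUMPTION: class names follow Apache Sedona 1.x documentation
# and are not confirmed by this repository.
sedona_spark_conf() {
  cat <<'EOF'
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator org.apache.sedona.core.serde.SedonaKryoRegistrator
spark.sql.extensions org.apache.sedona.viz.sql.SedonaVizExtensions,org.apache.sedona.sql.SedonaSqlExtensions
EOF
}

sedona_spark_conf
```

The printed lines can be pasted into the cluster's Spark config under its advanced options.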

Libraries: in depth

| Script | Lang | Library | Group |
|---|---|---|---|
| Runtime | | 12 | Databricks Runtime |
| Runtime | | 12 R | |
| Runtime | | 12 py | |
| Base | ppa | ppa:c2d4u.team/c2d4u4.0+ | R-CRAN binary install |
| Base | ppa | ppa:ubuntugis/ppa | Geospatial |
| Base | bin | parallel | GNU |
| Base | R | renv | RStudio Connect |
| Base | R | devtools | RStudio Connect |
| Base | R | rstudioapi | RStudio Connect |
| Base | R | packrat | RStudio Connect |
| Base | R | rsconnect | RStudio Connect |
| Base | R | dt | Shiny |
| Base | R | shinyjs | Shiny |
| Base | R | shinydashboard | Shiny |
| Base | R | shinycssloaders | Shiny |
| Base | R | sf | Geospatial |
| Base | R | raster | Geospatial |
| Base | R | leaflet | Geospatial |
| Base | R | arrow | |
| Base | R | plotly | |
| Base | R | biocmanager | Farm Stats |
| Base | R | bs4dash | Farm Stats |
| Base | R | janitor | Farm Stats |
| Base | R | odbc | Farm Stats |
| Base | R | rgdal | Farm Stats |
| Base | R | rpostgres | Farm Stats |
| Base | R | srvyr | Farm Stats |
| Base | R | zoo | Farm Stats |
| Base | py | pandas | |
| Base | py | matplotlib | |
| Base | py | openpyxl | |
| Base | bin | libgdal-dev | Geospatial |
| Base | bin | libgeos-dev | Geospatial |
| Base | bin | libproj-dev | Geospatial |
| Base | bin | libspatialindex-dev | Geospatial |
| Base | bin | libsqlite3-mod-spatialite | Geospatial |
| Base | py | spatialite | Geospatial |
| Base | py | rtree | Geospatial |
| Base | py | pyproj | Geospatial |
| Base | py | pyogrio | Geospatial |
| Base | py | geopandas | Geospatial |
| Base | py | geocube | Geospatial |
| Geo | jar | geotools-wrapper | Sedona |
| Geo | jar | sedona-python-adapter | Sedona |
| Geo | jar | sedona-viz | Sedona |
| Geo | py | apache-sedona | Sedona |
| Geo | py | databricks-mosaic | Mosaic |
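To show how the Geospatial rows of the list above could translate into install steps, here is a rough sketch. It is not the repository's actual script: the `run` wrapper, the `install_geospatial` function and the `DRY_RUN` variable are hypothetical, and the sketch assumes that `ppa` rows map to `add-apt-repository`, `bin` rows to `apt-get install`, and `py` rows to `pip install`. By default it only prints the commands, so it can be read and tried without root access.

```shell
#!/bin/sh
set -eu

# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# run CMD ARGS... (hypothetical helper):
# echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# install_geospatial (hypothetical helper):
# package names come from the Geospatial rows of the library list.
install_geospatial() {
  # "ppa" row: extra package source for up-to-date GIS libraries
  run add-apt-repository -y ppa:ubuntugis/ppa
  run apt-get update
  # "bin" rows: system libraries
  run apt-get install -y libgdal-dev libgeos-dev libproj-dev libspatialindex-dev libsqlite3-mod-spatialite
  # "py" rows: Python packages
  run pip install spatialite rtree pyproj pyogrio geopandas geocube
}

install_geospatial
```

Setting `DRY_RUN=0` would execute the same commands for real, which needs root privileges for the `apt` steps.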
