NEON Data API w/ Python¶
NEON developed an R and Python API for downloading data from their data store.
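As a rough illustration of what a Python call to the API can look like, the sketch below builds a request URL and fetches the file listing for one product, site, and month. The base URL, the `/data/{product}/{site}/{month}` path layout, and the `data` → `files` response keys are assumptions about the public NEON Data API that the text above does not state. The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: public NEON Data API base URL; not stated in the text above.
NEON_API_BASE = "https://data.neonscience.org/api/v0"


def neon_data_url(product_code: str, site_code: str, year_month: str) -> str:
    """Build the URL that lists data files for one product, site, and month."""
    return f"{NEON_API_BASE}/data/{product_code}/{site_code}/{year_month}"


def list_neon_files(product_code: str, site_code: str, year_month: str) -> list:
    """Query the API and return the file entries (requires network access)."""
    url = neon_data_url(product_code, site_code, year_month)
    with urllib.request.urlopen(url, timeout=30) as response:
        payload = json.load(response)
    # Assumption: file entries sit under data -> files in the JSON response.
    return payload.get("data", {}).get("files", [])
```

For example, `neon_data_url("DP1.30003.001", "HARV", "2016-08")` returns the listing URL for the Harvard Forest site. The product code (assumed here to be the discrete return lidar point cloud) and the month are illustrative values, not taken from this tutorial.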
Cloning Jupyter Tutorials from Github¶
We provide some example Python3 Notebooks and R Markdown Notebooks for downloading lidar and hyperspectral data.
Prerequisite: Installed Anaconda and RStudio-Server, launched Jupyter Notebook or Lab
In the terminal:
- Clone notebooks from NEON Data Science or CyVerse GIS to a location on the VM (e.g. /home/user/)
git clone https://github.com/cyverse-gis/neon_data_science
cd neon_data_science/lessons
- From Jupyter Notebook or Lab select a data download notebook.
- Follow the notebook instructions.
Download data from CyVerse DataStore in Bash¶
CyVerse uses a system called iRODS to move files onto and off of its Data Store.
iRODS uses multi-threaded file transfers for faster downloads and uploads than traditional wget or curl.
Prerequisite: Installed iRODS iCommands and initiated connection
Use the ils command to view your files on the Data Store.
Change ownership of the directory where you want to download the data.
sudo chown $USER:iplant-everyone /scratch -R
Create a new directory in /scratch
mkdir -p /scratch/2016_Campaign/HARV/L1/DiscreteLidar/
Use the iget command to download files from the Data Store
iget -KPQbrvf /iplant/home/shared/NEON_data_institute_2018/2016_Campaign/HARV/L1/DiscreteLidar/ClassifiedLaz /scratch/2016_Campaign/HARV/L1/DiscreteLidar/ClassifiedLaz
In this example we are using the flags to:
-K verify the checksum
-P output the progress of the download
-Q use the RBUDP (datagram) protocol for the data transfer
-b bulk file transfer
-r recursive, retrieve subcollections
-v verbose
-f force, write local files even if they already exist (overwrite them)
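If you would rather drive the download from a Python notebook, the iget call above can be assembled and launched with the standard library. This is a minimal sketch, assuming iCommands is installed and already connected. `build_iget_command` and `run_iget` are hypothetical helper names, and the flag table simply restates the list above.

```python
import shlex
import subprocess

# Flags taken from the iget example above, with their meanings.
IGET_FLAGS = {
    "K": "verify the checksum",
    "P": "output the progress of the download",
    "Q": "use the RBUDP (datagram) protocol",
    "b": "bulk file transfer",
    "r": "recursive, retrieve subcollections",
    "v": "verbose",
    "f": "force, overwrite existing local files",
}


def build_iget_command(remote: str, local: str, flags: str = "KPQbrvf") -> list:
    """Return the iget argument list, rejecting flags not described above."""
    unknown = [flag for flag in flags if flag not in IGET_FLAGS]
    if unknown:
        raise ValueError(f"Flags not covered by this sketch: {unknown}")
    return ["iget", f"-{flags}", remote, local]


def run_iget(remote: str, local: str, flags: str = "KPQbrvf"):
    """Run iget; raises CalledProcessError if the transfer fails."""
    command = build_iget_command(remote, local, flags)
    print("Running:", shlex.join(command))
    return subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths, and `check=True` turns a failed transfer into a Python exception instead of a silently ignored exit code.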
Upload data to the CyVerse DataStore in Bash¶
- Use the iput command to upload files to the Data Store
iput -KPQbrvf /scratch/2016_Campaign/HARV/L1/DiscreteLidar/some_results /iplant/home/$USER/neon/results
Note: we are using the same flags as in the iget statement above.
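Because -r uploads a whole directory tree and -f overwrites what is already on the Data Store, it can be worth listing what is about to be sent first. The sketch below is a local dry run only: it walks a results directory and reports the files and their combined size, and never calls iput. `plan_upload` is a hypothetical helper name.

```python
from pathlib import Path


def plan_upload(local_dir: str) -> tuple:
    """Return (sorted relative file paths, total size in bytes) under local_dir."""
    root = Path(local_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"Not a directory: {local_dir}")
    # rglob("*") descends into subdirectories, mirroring the -r flag.
    files = sorted(path for path in root.rglob("*") if path.is_file())
    total_bytes = sum(path.stat().st_size for path in files)
    return [path.relative_to(root).as_posix() for path in files], total_bytes
```

Calling `plan_upload("/scratch/2016_Campaign/HARV/L1/DiscreteLidar/some_results")` returns the relative file paths and byte count for the directory used in the iput example.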
Download data from CyVerse DataStore with Cyberduck¶
After you’ve set up Cyberduck to access your CyVerse DataStore, you can drag and drop files to your localhost, or drag and drop files into a second Cyberduck window that is connected to another data source.
Note
Dragging and dropping data with Cyberduck will cause the data to be streamed down to your localhost and then uploaded back to the second remotehost. This will greatly reduce the speed with which you transfer files.
It is strongly suggested you use the Cyberduck CLI tool to move files between two remote data stores.
Jupyter Lab Google Drive Client¶
Google Drive will ask for some authentication through your browser with a token. After you authenticate you can view files in your Google Drive and move them onto the VM.
If you have any data on Google Drive, you can drag and drop it onto your VM.
Jupyter Lab iRODS Client¶
After you’ve authenticated to CyVerse, you will be able to view your data store files.
The Jupyter iRODS Client is not suitable for downloading hundreds of files, but it is useful for finding files and copying their URLs.