Use `darn-dmap` for DMAP I/O behind the scenes.
#76
base: develop
Conversation
* Removes a large portion of the Python code in this library, offloading that responsibility to `darn-dmap`.
* Removes the (I think unused?) dependencies `pyyaml` and `pathlib2` and the `setup.py` file.
* Removes the generic DMAP I/O (not tied to any file type)

Known bug: the `write_[iqdat, rawacf, etc]` functions fail when a numpy array of shape (1,) is passed through. This array is converted to a scalar value, which throws an error in `darn-dmap`.
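To illustrate the known bug, here is a minimal NumPy-only sketch of how a shape `(1,)` array can lose its dimension and how it can be restored. Where the conversion actually happens in this library is not shown here; `squeeze()` is only one example of an operation with that effect, and the variable names are hypothetical.

```python
import numpy as np

# A one-element array, as a field holding a single value might appear in a record.
field = np.array([12], dtype=np.int16)

# squeeze() drops the only dimension, leaving a 0-d (scalar-like) value.
collapsed = field.squeeze()
print(field.shape, collapsed.shape)    # (1,) ()

# np.atleast_1d restores a one-dimensional array and keeps the dtype.
restored = np.atleast_1d(collapsed)
print(restored.shape, restored.dtype)  # (1,) int16
```

Wrapping array-typed fields in `np.atleast_1d` before writing is one possible guard, assuming the writer expects an array for that field.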
Okay so using this code to check very quickly all the read and write methods:
# Read and write test for pyDARNio with darn-dmap dependency
import pydarnio
import time
# ====================================== IQDAT ======================================
try:
    iq_in_file = '/Users/carley/Documents/data/iq/20160303.1800.17.cly.iqdat'
    iq_out_file = '/Users/carley/Documents/data/write_files/20160303.1800.17.cly.iqdat'
    t1 = time.time()
    iq_data = pydarnio.read_iqdat(iq_in_file)
    t2 = time.time()
    print('------ IQ ------')
    print(iq_data[0].keys())
    t3 = time.time()
    pydarnio.write_iqdat(iq_data, iq_out_file)
    t4 = time.time()
    print('IQ Read took: ', t2-t1, ' seconds')
    print('IQ Write took: ', t4-t3, ' seconds')
    print('SUCCESS IQ')
except Exception as e:
    print('FAIL IQ')
    print(e)
# ====================================== RAWACF ======================================
try:
    raw_in_file = '/Users/carley/Documents/data/rawacf/20200713.0200.06.rkn.rawacf'
    raw_out_file = '/Users/carley/Documents/data/write_files/20200713.0200.06.rkn.rawacf'
    t1 = time.time()
    raw_data = pydarnio.read_rawacf(raw_in_file)
    t2 = time.time()
    print('------ RAWACF ------')
    print(raw_data[0].keys())
    t3 = time.time()
    pydarnio.write_rawacf(raw_data, raw_out_file)
    t4 = time.time()
    print('RAWACF Read took: ', t2-t1, ' seconds')
    print('RAWACF Write took: ', t4-t3, ' seconds')
    print('SUCCESS RAWACF')
except Exception as e:
    print('FAIL RAWACF')
    print(e)
# ====================================== GRID ======================================
try:
    grid_in_file = '/Users/carley/Documents/data/grids/20150308.south.grid2'
    grid_out_file = '/Users/carley/Documents/data/write_files/20150308.south.grid2'
    t1 = time.time()
    grid_data = pydarnio.read_grid(grid_in_file)
    t2 = time.time()
    print('------ GRID ------')
    print(grid_data[0].keys())
    t3 = time.time()
    pydarnio.write_grid(grid_data, grid_out_file)
    t4 = time.time()
    print('GRID Read took: ', t2-t1, ' seconds')
    print('GRID Write took: ', t4-t3, ' seconds')
    print('SUCCESS GRID')
except Exception as e:
    print('FAIL GRID')
    print(e)
# ====================================== MAP ======================================
try:
    map_in_file = '/Users/carley/Documents/data/maps/20210101.n.map'
    map_out_file = '/Users/carley/Documents/data/write_files/20210101.n.map'
    t1 = time.time()
    map_data = pydarnio.read_map(map_in_file)
    t2 = time.time()
    print('------ MAP ------')
    print(map_data[0].keys())
    t3 = time.time()
    pydarnio.write_map(map_data, map_out_file)
    t4 = time.time()
    print('MAP Read took: ', t2-t1, ' seconds')
    print('MAP Write took: ', t4-t3, ' seconds')
    print('SUCCESS MAP')
except Exception as e:
    print('FAIL MAP')
    print(e)
# ====================================== SND ======================================
try:
    snd_in_file = '/Users/carley/Documents/data/snd/20230404.00.ice.snd'
    snd_out_file = '/Users/carley/Documents/data/write_files/20230404.00.ice.snd'
    t1 = time.time()
    snd_data = pydarnio.read_snd(snd_in_file)
    t2 = time.time()
    print('------ SND ------')
    print(snd_data[0].keys())
    t3 = time.time()
    pydarnio.write_snd(snd_data, snd_out_file)
    t4 = time.time()
    print('SND Read took: ', t2-t1, ' seconds')
    print('SND Write took: ', t4-t3, ' seconds')
    print('SUCCESS SND')
except Exception as e:
    print('FAIL SND')
    print(e)
# ====================================== FITACF ======================================
try:
    fit_in_file = '/Users/carley/Documents/data/20211102.0000.00.rkn.a.fitacf'
    fit_out_file = '/Users/carley/Documents/data/write_files/20211102.0000.00.rkn.a.fitacf'
    t1 = time.time()
    fit_data = pydarnio.read_fitacf(fit_in_file)
    t2 = time.time()
    print('------ FITACF ------')
    print(fit_data[0].keys())
    t3 = time.time()
    pydarnio.write_fitacf(fit_data, fit_out_file)
    t4 = time.time()
    print('FITACF Read took: ', t2-t1, ' seconds')
    print('FITACF Write took: ', t4-t3, ' seconds')
    print('SUCCESS FITACF')
except Exception as e:
    print('FAIL FITACF')
    print(e)
I get this output:
So to summarise: every fitacf file I tried failed on write with the corrupted `slist` error (plus a deprecation warning for numpy >1.25), and SND failed on write with the same error; reading was fine for both. I printed the `slist` I was trying to write, and it's not a scalar, so I don't really know why it's giving that error. Reading a map file failed with an unsupported field error. All others read and wrote fine. Tested with numpy versions 1.26.4, 1.25.0 and 1.24.4. Speedy reading is excellent though!
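When chasing a scalar-versus-array error like the `slist` one, it can help to list the type, dtype and shape of the field in every record rather than printing one value. The sketch below assumes the readers return a list of dict-like records (as `fit_data[0].keys()` in the script suggests); `describe_field` is a hypothetical helper and the records are stand-in values, not data from a real file.

```python
import numpy as np

def describe_field(records, name):
    """Return (record index, type name, dtype, shape) for each record holding `name`."""
    rows = []
    for i, record in enumerate(records):
        if name not in record:
            continue
        value = record[name]
        dtype = getattr(value, "dtype", None)
        shape = getattr(value, "shape", None)
        rows.append((i, type(value).__name__, str(dtype), shape))
    return rows

# Stand-in records with hypothetical values.
records = [
    {"slist": np.array([3, 4, 5], dtype=np.int16)},
    {"slist": np.array([7], dtype=np.int16)},
    {"slist": np.int16(7)},   # a NumPy scalar: shape is ()
    {"nrang": np.int16(75)},  # no slist in this record
]
for row in describe_field(records, "slist"):
    print(row)
```

A record whose shape is `()` or `(1,)` is the first place to look, since the first record printed may not be the one that triggers the failure.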
Did same tests as above and added a read_dmap and a read_fitacf which has a bz2 input:
# ====================================== GENERIC DMAP ======================================
try:
    fit_in_file = '/Users/carley/Documents/data/20211102.0000.00.rkn.a.fitacf'
    fit_out_file = '/Users/carley/Documents/data/write_files/20211102.0000.00.rkn.dmap.fitacf'
    t1 = time.time()
    fit_data = pydarnio.read_dmap(fit_in_file)
    t2 = time.time()
    print('------ Generic DMAP ------')
    t3 = time.time()
    pydarnio.write_fitacf(fit_data, fit_out_file)
    t4 = time.time()
    print('Generic DMAP Read took: ', t2-t1, ' seconds')
    print('Generic DMAP Write took: ', t4-t3, ' seconds')
    print('SUCCESS Generic DMAP')
except Exception as e:
    print('FAIL Generic DMAP')
    print(e)
# ====================================== BZ2 ======================================
try:
    fit_in_file = '/Users/carley/Documents/data/20211102.0000.00.rkn.a.fitacf.bz2'
    t1 = time.time()
    fit_data = pydarnio.read_fitacf(fit_in_file)
    t2 = time.time()
    print('------ Generic BZ2 ------')
    print('BZ2 Read took: ', t2-t1, ' seconds')
    print('SUCCESS BZ2')
except Exception as e:
    print('FAIL BZ2')
    print(e)
Both succeeded:
I'm still having an issue with the map file though, where it is expecting `vector.mlat`.
It's not unusual for a record to have none of the 'vector' fields, so you get some partial records (especially the first records for some reason, but I did test with another file and got the 300th record missing `vector.mlat` too). I'm assuming it's just not set as optional. In superdarn_formats.py in pydarnio these fields are called 'partial fields' for some reason, but they're essentially the same as extra fields in that they might not be present in the file and that's fine. pyDARNio/pydarnio/dmap/superdarn_formats.py Line 329 in 543ee3a
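The difference between treating the 'vector' fields as required and treating them as optional can be sketched as a set check. This is only an illustration of the idea, not how `darn-dmap` or pydarnio validates records; `missing_required` and the small field sets are hypothetical, and the real field lists live in superdarn_formats.py.

```python
def missing_required(record, required, optional=()):
    """Return required field names absent from the record; optional ones are never reported."""
    return sorted(set(required) - set(optional) - set(record))

# Illustrative field sets only.
required = {"start.year", "stid"}
partial = {"vector.mlat", "vector.mlon"}

# A partial record: none of the 'vector' fields are present.
record = {"start.year": 2021, "stid": [65]}

print(missing_required(record, required | partial))           # ['vector.mlat', 'vector.mlon']
print(missing_required(record, required | partial, partial))  # []
```

With the partial fields marked optional, a record that lacks every 'vector' field passes the check instead of being rejected.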
I've fixed the most recent bug that @carleyjmartin has encountered. This branch is ready for a PR as soon as the
This may be a naive question, but it looks like all of the native python
Not at all, I didn't realize I hadn't linked it anywhere! The source code is auto-generated from this repository. The repository is written entirely in Rust, and uses a tool to generate Python bindings. I've used a GitHub action that builds wheel files for a wide range of target operating systems and CPU architectures, so it should be a simple
Also, I did conduct some benchmarking to quantify the improved performance.
OS: openSUSE Leap 15.3
Reading
The conclusion I drew from this is that the speed boost is most noticeable for files with lots of records, and for file types with more fields per record.
Writing
The speed boost is more moderate with
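For anyone wanting to repeat this kind of comparison, a small timing harness along these lines may be useful. It is a sketch and not the method used for the benchmarks above; `benchmark` is a hypothetical helper, and the built-in `sum` stands in for a reader such as `pydarnio.read_fitacf` called with a file path.

```python
import statistics
import time

def benchmark(func, *args, repeats=5):
    """Call func(*args) `repeats` times and return (best, median) wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings), statistics.median(timings)

# Stand-in workload so the sketch runs without any data files.
best, median = benchmark(sum, range(100_000))
print(f"best {best:.6f} s, median {median:.6f} s")
```

Repeating each call and reporting the best and median times reduces the influence of disk caching and other one-off effects on a single measurement.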
Scope
This PR is a major change to the API, completely changing the DMAP I/O workflow.
* `SDarnRead` and `SDarnWrite` are gone, replaced by one-off reading and writing functions for each specific file type.
* Offloads DMAP I/O to `darn-dmap`.
* Removes the dependencies `pyyaml` and `pathlib2` and the `setup.py` file.

issue: N/A
Approval
Number of approvals: 2
Test
Please test the following DMAP I/O functions:
read_iqdat
read_rawacf
read_fitacf
read_grid
read_map
read_snd
read_dmap
write_iqdat
write_rawacf
write_fitacf
write_grid
write_map
write_snd
write_dmap
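One way to exercise each read/write pair listed above is a round-trip check: read a file, write it back out, re-read the copy and compare. The sketch below assumes every reader returns a list of dict-like records and every writer takes `(records, path)`, as in the test scripts in this conversation; `records_equal` and `roundtrip` are hypothetical helpers, and the module is passed in as an argument so the function does not depend on a particular install.

```python
import numpy as np

# Suffixes of the read_*/write_* function pairs listed above.
KINDS = ["iqdat", "rawacf", "fitacf", "grid", "map", "snd", "dmap"]

def records_equal(a, b):
    """True when two lists of dict-like records hold the same fields and values."""
    if len(a) != len(b):
        return False
    for rec_a, rec_b in zip(a, b):
        if set(rec_a) != set(rec_b):
            return False
        if not all(np.array_equal(rec_a[k], rec_b[k]) for k in rec_a):
            return False
    return True

def roundtrip(module, kind, in_path, out_path):
    """Read in_path, write to out_path, re-read, and compare the two record lists."""
    read = getattr(module, f"read_{kind}")
    write = getattr(module, f"write_{kind}")
    original = read(in_path)
    write(original, out_path)
    return records_equal(original, read(out_path))
```

A call such as `roundtrip(pydarnio, "grid", in_path, out_path)` would then return `True` when the rewritten file reads back identically. Note that `np.array_equal` treats NaN values as unequal by default, so fields containing NaN would need separate handling.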