Readme update to adhere to template #83

Open
wants to merge 7 commits into
base: main


AUdaltsova
Contributor

Pull Request

Description

Redid the readme to be the same as the template. Open to comments/suggestions as I am very wordy and this could probably use some trimming. Some notes:

  • I might have gone a bit overboard with the 'this is in development' shouting. We can maybe remove the second paragraph in the description, since the note says pretty much the same thing
  • I don't like that the FAQ is a) massive b) is just one question which c) potentially doesn't belong here at all? I think it's important to have it answered somewhere visible but maybe not in this repo. Or maybe it's fine. (Also very much correct me if the answer is off anywhere)
  • We don't have docs for this as far as I know (though the torch datasets readme is a big help, thanks James & Peter!) and I think it would be useful to have some very high-level docs of how everything fits together and what the folders are, etc. But this can probably wait till dsampler is a bit more stable, we might change things a bit still.

Checklist:

  • My code follows OCF's coding style guidelines
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked my code and corrected any misspellings


### How does ocf-data-sampler deal with data sources that use different projections (e.g. some are in latitude-longitude, and some in OSGB)?

When creating samples, we make an areal crop of a preset size centred on a point of interest (POI, usually a solar or wind farm). The crop size is set not in miles or kilometres but in 'pixels', so the ground area it covers differs between data sources depending on their spatial resolution, the projection they use, and where the POI is. For example, a pixel in a latitude-longitude source with 1° resolution spans a very different 'surface' distance (one you might measure in, e.g., kilometres) from a pixel in a source with 0.1° resolution. The pixel size will even be different for the same source depending on how close the POI is to the equator!
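
To make the point about pixel sizes concrete, here is a minimal sketch of how the ground extent of one latitude-longitude pixel could be estimated. It assumes a spherical Earth with a mean radius of 6371 km, and `pixel_size_km` is a hypothetical helper written for illustration, not a function from ocf-data-sampler.

```python
import math
from typing import Tuple

# Assumption: spherical Earth with mean radius 6371 km (good enough for illustration).
EARTH_RADIUS_KM = 6371.0


def pixel_size_km(resolution_deg: float, latitude_deg: float) -> Tuple[float, float]:
    """Approximate (east-west, north-south) extent in km of one lat-lon pixel.

    Hypothetical helper for illustration only; not part of ocf-data-sampler.
    """
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    # North-south extent depends only on the resolution.
    north_south = resolution_deg * km_per_degree
    # East-west extent shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return east_west, north_south


if __name__ == "__main__":
    # Compare a coarse and a fine source at the equator and at 60 degrees north.
    for resolution in (1.0, 0.1):
        for latitude in (0.0, 60.0):
            ew, ns = pixel_size_km(resolution, latitude)
            print(f"{resolution} deg pixel at lat {latitude}: {ew:.1f} km x {ns:.1f} km")
```

Under these assumptions, the same number of pixels covers roughly ten times the distance in the 1° source as in the 0.1° source, and a pixel at 60° latitude is about half as wide east-west as the same pixel at the equator.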
Member

This is a great answer, really well explained!


## Contributors ✨
**ocf-data-sampler** contains all the infrastructure needed to create batches and feed them to our models, such as [PVNet](https://github.com/openclimatefix/PVNet/). The data we work with is usually too heavy to do this on the fly, so that's where this repo comes in: handling steps like opening the data, selecting the right samples, normalising and reshaping, and saving to and reading from disk.
Member

I think this is a good short intro to the library, two comments:

  1. The word "infrastructure" might be a bit confusing, since it usually has a specific meaning in software development (e.g. the services defined in our ocf-infrastructure repo), so it's perhaps worth rewording slightly.
  2. Is it worth mentioning weather/energy data somewhere, so it's clear that this is what is being sampled?


## Documentation

**ocf-data-sampler** doesn't have external documentation; you can read a bit about how our torch datasets work in the Readme [here](https://github.com/openclimatefix/ocf-data-sampler/tree/readme-update/ocf_data_sampler/torch_datasets).
Member

Suggested change
Before: **ocf-data-sampler** doesn't have external documentation; you can read a bit about how our torch datasets work in the Readme [here](https://github.com/openclimatefix/ocf-data-sampler/tree/readme-update/ocf_data_sampler/torch_datasets).
After: **ocf-data-sampler** doesn't have external documentation _yet_; you can read a bit about how our torch datasets work in the Readme [here](https://github.com/openclimatefix/ocf-data-sampler/tree/readme-update/ocf_data_sampler/torch_datasets).
