diff --git a/joss.05755/10.21105.joss.05755.crossref.xml b/joss.05755/10.21105.joss.05755.crossref.xml new file mode 100644 index 0000000000..8448954d26 --- /dev/null +++ b/joss.05755/10.21105.joss.05755.crossref.xml @@ -0,0 +1,339 @@ + + + + 20231116T170444-68d5f67ef7febb92d7b6b965d3d76e36b965d0c5 + 20231116170444 + + JOSS Admin + admin@theoj.org + + The Open Journal + + + + + Journal of Open Source Software + JOSS + 2475-9066 + + 10.21105/joss + https://joss.theoj.org + + + + + 11 + 2023 + + + 8 + + 91 + + + + pvOps: a Python package for empirical analysis of +photovoltaic field data + + + + Kirk L. + Bonney + https://orcid.org/0009-0006-2383-1634 + + + Thushara + Gunda + https://orcid.org/0000-0003-1945-4064 + + + Michael W. + Hopwood + https://orcid.org/0000-0001-6190-1767 + + + Hector + Mendoza + https://orcid.org/0009-0007-5812-606X + + + Nicole D. + Jackson + https://orcid.org/0000-0002-3814-9906 + + + + 11 + 16 + 2023 + + + 5755 + + + 10.21105/joss.05755 + + + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + + + + Software archive + 10.5281/zenodo.10126530 + + + GitHub review issue + https://github.com/openjournals/joss-reviews/issues/5755 + + + + 10.21105/joss.05755 + https://joss.theoj.org/papers/10.21105/joss.05755 + + + https://joss.theoj.org/papers/10.21105/joss.05755.pdf + + + + + + RdTools: An open source python library for PV +degradation analysis + Deceglie + 2018 + Deceglie, M. G., Jordan, D., Nag, A., +Deline, C. A., & Shinn, A. (2018). RdTools: An open source python +library for PV degradation analysis. National Renewable Energy +Lab.(NREL), Golden, CO (United States). + + + A machine learning evaluation of maintenance +records for common failure modes in PV inverters + Gunda + IEEE Access + 8 + 10.1109/ACCESS.2020.3039182 + 2020 + Gunda, T., Hackett, S., Kraus, L., +Downs, C., Jones, R., McNalley, C., Bolen, M., & Walker, A. (2020). 
+A machine learning evaluation of maintenance records for common failure +modes in PV inverters. IEEE Access, 8, 211610–211620. +https://doi.org/10.1109/ACCESS.2020.3039182 + + + Pvlib python: A python package for modeling +solar energy systems + Holmgren + Journal of Open Source +Software + 29 + 3 + 10.21105/joss.00884 + 2018 + Holmgren, W. F., Hansen, C. W., & +Mikofski, M. A. (2018). Pvlib python: A python package for modeling +solar energy systems. Journal of Open Source Software, 3(29), 884. +https://doi.org/10.21105/joss.00884 + + + Neural network-based classification of +string-level IV curves from physically-induced failures of photovoltaic +modules + Hopwood + IEEE Access + 8 + 10.1109/ACCESS.2020.3021577 + 2020 + Hopwood, M. W., Gunda, T., Seigneur, +H., & Walters, J. (2020). Neural network-based classification of +string-level IV curves from physically-induced failures of photovoltaic +modules. IEEE Access, 8, 161480–161487. +https://doi.org/10.1109/ACCESS.2020.3021577 + + + Classification of photovoltaic failures with +hidden markov modeling, an unsupervised statistical +approach + Hopwood + Energies + 14 + 15 + 10.3390/en15145104 + 2022 + Hopwood, M. W., Patel, L., & +Gunda, T. (2022). Classification of photovoltaic failures with hidden +markov modeling, an unsupervised statistical approach. Energies, 15(14), +5104. https://doi.org/10.3390/en15145104 + + + Generation of data-driven expected energy +models for photovoltaic systems + Hopwood + Applied Sciences + 4 + 12 + 10.3390/app12041872 + 2022 + Hopwood, M. W., & Gunda, T. +(2022). Generation of data-driven expected energy models for +photovoltaic systems. Applied Sciences, 12(4), 1872. +https://doi.org/10.3390/app12041872 + + + Physics-based method for generating fully +synthetic IV curve training datasets for machine learning classification +of PV failures + Hopwood + Energies + 14 + 15 + 10.3390/en15145085 + 2022 + Hopwood, M. W., Stein, J. S., Braid, +J. L., & Seigneur, H. P. (2022). 
Physics-based method for generating +fully synthetic IV curve training datasets for machine learning +classification of PV failures. Energies, 15(14), 5085. +https://doi.org/10.3390/en15145085 + + + pvOps: Improving operational assessments +through data fusion + Mendoza + 2021 IEEE 48th photovoltaic specialists +conference (PVSC) + 10.1109/PVSC43889.2021.9518439 + 2021 + Mendoza, H., Hopwood, M., & +Gunda, T. (2021). pvOps: Improving operational assessments through data +fusion. 2021 IEEE 48th Photovoltaic Specialists Conference (PVSC), +0112–0119. +https://doi.org/10.1109/PVSC43889.2021.9518439 + + + Pandas-dev/pandas: pandas + The pandas development team + 10.5281/zenodo.3509134 + 2020 + The pandas development team. (2020). +Pandas-dev/pandas: pandas (latest). Zenodo. +https://doi.org/10.5281/zenodo.3509134 + + + Identifying degradation modes of photovoltaic +modules using unsupervised machine learning on electroluminescense +images + Pierce + 2020 47th IEEE photovoltaic specialists +conference (PVSC) + 10.1109/PVSC45281.2020.9301021 + 2020 + Pierce, B. G., Karimi, A. M., Liu, +J., French, R. H., & Braid, J. L. (2020). Identifying degradation +modes of photovoltaic modules using unsupervised machine learning on +electroluminescense images. 2020 47th IEEE Photovoltaic Specialists +Conference (PVSC), 1850–1855. +https://doi.org/10.1109/PVSC45281.2020.9301021 + + + Performance monitoring using pecos (v. +0.1) + Klise + 10.2172/1734479 + 2016 + Klise, K. A., & Stein, J. S. +(2016). Performance monitoring using pecos (v. 0.1). Sandia National +Laboraties. https://doi.org/10.2172/1734479 + + + Collaborative data science + Plotly Technologies Inc. + 2015 + Plotly Technologies Inc. (2015). +Collaborative data science. Plotly Technologies Inc. +https://plot.ly + + + Seaborn: Statistical data +visualization + Waskom + Journal of Open Source +Software + 60 + 6 + 10.21105/joss.03021 + 2021 + Waskom, M. L. (2021). Seaborn: +Statistical data visualization. 
Journal of Open Source Software, 6(60), +3021. https://doi.org/10.21105/joss.03021 + + + Matplotlib: A 2D graphics +environment + Hunter + Computing in Science & +Engineering + 3 + 9 + 10.1109/MCSE.2007.55 + 2007 + Hunter, J. D. (2007). Matplotlib: A +2D graphics environment. Computing in Science & Engineering, 9(3), +90–95. https://doi.org/10.1109/MCSE.2007.55 + + + Natural language processing with +python + Bird + 2009 + Bird, S., Klein, E., & Loper, E. +(2009). Natural language processing with python. O’Reilly +Media. + + + Keras + Chollet + 2015 + Chollet, F., & others. (2015). +Keras. https://keras.io. + + + Scikit-learn: Machine learning in +Python + Pedregosa + Journal of Machine Learning +Research + 12 + 2011 + Pedregosa, F., Varoquaux, G., +Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., +Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., +Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). +Scikit-learn: Machine learning in Python. Journal of Machine Learning +Research, 12, 2825–2830. + + + PVAnalytics: A python package for automated +processing of solar time series data + Perry + 2022 + Perry, K., Vining, W., Anderson, K., +Muller, M., & Hansen, C. (2022). PVAnalytics: A python package for +automated processing of solar time series data. National Renewable +Energy Lab.(NREL), Golden, CO (United States). + + + + + + diff --git a/joss.05755/10.21105.joss.05755.jats b/joss.05755/10.21105.joss.05755.jats new file mode 100644 index 0000000000..2ae8495b7a --- /dev/null +++ b/joss.05755/10.21105.joss.05755.jats @@ -0,0 +1,661 @@ + + +
+ + + + +Journal of Open Source Software +JOSS + +2475-9066 + +Open Journals + + + +5755 +10.21105/joss.05755 + +pvOps: a Python package for empirical analysis of +photovoltaic field data + + + +https://orcid.org/0009-0006-2383-1634 + +Bonney +Kirk L. + + +* + + +https://orcid.org/0000-0003-1945-4064 + +Gunda +Thushara + + + + +https://orcid.org/0000-0001-6190-1767 + +Hopwood +Michael W. + + + + +https://orcid.org/0009-0007-5812-606X + +Mendoza +Hector + + + + +https://orcid.org/0000-0002-3814-9906 + +Jackson +Nicole D. + + + + + +Sandia National Laboratories, USA + + + + +University of Central Florida, USA + + + + +* E-mail: + + +4 +4 +2023 + +8 +91 +5755 + +Authors of papers retain copyright and release the +work under a Creative Commons Attribution 4.0 International License (CC +BY 4.0) +2022 +The article authors + +Authors of papers retain copyright and release the work under +a Creative Commons Attribution 4.0 International License (CC BY +4.0) + + + +Python +photovoltaic +time series +machine learning +natural language processing + + + + + + Summary +

The purpose of pvOps is to support empirical + evaluations of data collected in the field related to the operations + and maintenance (O&M) of photovoltaic (PV) power plants. + pvOps presently contains modules that address + the diversity of field data, including text-based maintenance logs, + current-voltage (IV) curves, and timeseries of production information. + The package functions leverage machine learning, visualization, and + other techniques to enable cleaning, processing, and fusion of these + datasets. These capabilities are intended to facilitate easier + evaluation of field patterns and extraction of relevant insights to + support reliability-related decision-making for PV sites. The + open-source code, examples, and instructions for installing the + package through PyPI can be accessed through the + GitHub + repository.

+
+ + Statement of Need +

Continued interest in PV deployment across the world has resulted + in increased awareness of needs associated with managing reliability + and performance of these systems during operation. Current open-source + packages for PV analysis focus on theoretical evaluations of solar + power simulations (e.g., pvlib + (Holmgren + et al., 2018)), data cleaning and feature development for + production data (e.g., pvanalytics + (Perry + et al., 2022)), specific use cases of empirical evaluations + (e.g., RdTools + (Deceglie + et al., 2018) and Pecos + (Klise + & Stein, 2016) for degradation analysis), or analysis of + electroluminescence images (e.g., PVimage + (Pierce + et al., 2020)); see + openpvtools + for a list of additional open-source PV packages. However, a general + package that can support data-driven, exploratory evaluations of + diverse field-collected information is currently lacking. For example, + a maintenance log that describes an inverter failure may be temporally + correlated with a dip in production levels. Identifying such + relationships across different types of field data can improve + understanding of the impacts of certain types of failures on a PV + plant. To address this gap, we present pvOps, + an open-source Python package that can be used by researchers and + industry analysts alike to evaluate and extract insights from + different types of data routinely collected during PV field + operations.

+

PV data collected in the field varies greatly in structure (e.g., + timeseries and text records) and quality (e.g., completeness and + consistency). The data available for analysis is frequently + semi-structured. Furthermore, the level of detail collected varies + among owners and operators. For example, some may capture only a + general start and end time for an event, whereas others + include additional time details for different resolution + activities. This diversity in data types and structures often leads to + data being under-utilized due to the amount of manual processing + required. To address these issues, pvOps + provides a suite of data processing, cleaning, and visualization + methods to leverage insights across a broad range of data types, + including operations and maintenance records, production timeseries, + and IV curves. The functions within pvOps + enable users to better parse available data to understand patterns in + outages and production losses.

+
+ + Package Overview +

The following table summarizes the four modules within + pvOps by presenting the type of data they + analyze, example data features, and highlights of relevant + functions.

+

Table 1. Summary of modules and functions within + pvOps

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ModuleType of dataExample data featuresHighlights of functions
textO&M recordstimestamps, issue + description, issue + classificationfill data gaps in dates and categorical records, visualize + word clusters and patterns over time
timeseriesProduction datasite, timestamp, + power production, + irradianceestimate expected energy with multiple models, evaluate + inverter clipping
text2timeO&M records and production datasee entries for text and + timeseries modules aboveanalyze overlaps between O&M and production + (timeseries) records, visualize overlaps between O&M + records and production data
ivIV recordscurrent, voltage, + irradiance, temperaturesimulate IV curves with physical faults, extract diode + parameters from IV curves, classify faults using IV + curves
+
+
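As a rough illustration of the "estimate expected energy" entry in the table above, the sketch below fits a one-variable least-squares baseline of energy against irradiance. It is a minimal sketch under stated assumptions, not the pvOps API: the function name `fit_expected_energy` and the sample values are hypothetical, and the models in the `timeseries` module may differ substantially.

```python
def fit_expected_energy(irradiance, energy):
    """Fit energy ~ slope * irradiance + intercept by ordinary least squares.

    Hypothetical helper for illustration only; it is not part of pvOps.
    """
    n = len(irradiance)
    mean_x = sum(irradiance) / n
    mean_y = sum(energy) / n
    # Closed-form simple linear regression.
    sxx = sum((x - mean_x) ** 2 for x in irradiance)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(irradiance, energy))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: irradiance in W/m^2, energy in kWh per interval.
irradiance = [200.0, 400.0, 600.0, 800.0]
energy = [21.0, 41.0, 61.0, 81.0]

slope, intercept = fit_expected_energy(irradiance, energy)

# Expected energy for a new irradiance reading under the fitted baseline.
expected_at_1000 = slope * 1000.0 + intercept
```

Comparing measured energy against such a baseline gives a simple per-interval estimate of production shortfall.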

The functions within each module can be used to build pipelines + that integrate relevant data processing, fusion, and visualization + capabilities to support user end goals. For example, a user with IV + curve data could build a pipeline that leverages functions within the + iv module to process and extract diode + parameters within IV curves as well as train models to support + classifications based on fault type. A pipeline could also be built + that leverages functions across modules if a user has access to + multiple types of data (e.g., both O&M and production records). A + sample end-to-end workflow using pvOps modules + could be:

+ + +

Use functions within the text module to + systematically review data quality issues within O&M records, + train a machine learning model on available records, and use the + model to estimate possible labels for missing entries

+
+ +

Use functions within the + timeseries module to establish a basis for + evaluating possible production losses: train a machine learning model + to develop a custom expected energy model for a given time series + of irradiance and system size details, use a pre-trained + expected energy model + (Hopwood + & Gunda, 2022), or apply industry-standard equations

+
+ +

Couple outputs from the above two analyses (using functions in + the text2time module) based on timestamps + to develop summaries and visualizations of production impacts + observed during these periods

+
+
+
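The coupling step in the workflow above can be sketched in plain Python: each O&M record is matched to the production intervals that fall inside its start and end times, and the shortfall relative to expected energy is summed. This is a hedged illustration that assumes a simple record layout; the function `energy_loss_by_event`, the field names, and the sample data are hypothetical and do not reproduce the `text2time` module's actual interface.

```python
from datetime import datetime, timedelta


def energy_loss_by_event(om_records, production):
    """Sum the energy shortfall (expected minus actual) inside each O&M event window.

    Hypothetical helper for illustration only; it is not part of pvOps.
    """
    losses = {}
    for rec in om_records:
        # Keep production intervals whose timestamp falls in [start, end).
        in_window = [
            p for p in production
            if rec["start"] <= p["timestamp"] < rec["end"]
        ]
        # Clip at zero so over-production does not offset losses.
        losses[rec["id"]] = sum(
            max(p["expected_kwh"] - p["actual_kwh"], 0.0) for p in in_window
        )
    return losses


# Made-up hourly production data with a dip in hours 2 and 3.
t0 = datetime(2023, 6, 1, 10)
production = [
    {
        "timestamp": t0 + timedelta(hours=h),
        "expected_kwh": 100.0,
        "actual_kwh": 40.0 if h in (2, 3) else 100.0,
    }
    for h in range(6)
]

# Made-up O&M record covering the same two hours.
om_records = [
    {
        "id": "WO-1",
        "description": "inverter failure",
        "start": t0 + timedelta(hours=2),
        "end": t0 + timedelta(hours=4),
    }
]

losses = energy_loss_by_event(om_records, production)
```

Here the expected energy values are supplied directly; in practice they would come from whichever expected energy model was chosen in the previous step.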

The + package + documentation for pvOps provides + thorough examples exploring the various capabilities of each module. + Additional details about the iv module + capabilities are captured in + (Hopwood + et al., 2020; + Hopwood, + Stein, et al., 2022) while more information about the design + and development of the text, + timeseries, and + text2time modules is captured in + (Mendoza + et al., 2021). Key package dependencies of + pvOps include pandas + (The + pandas development team, 2020), sklearn + (Pedregosa + et al., 2011), nltk + (Bird + et al., 2009), and keras + (Chollet + & others, 2015) for analysis and + matplotlib + (Hunter, + 2007), seaborn + (Waskom, + 2021), and plotly + (Plotly + Technologies Inc., 2015) for visualization.

+
+ + Ongoing Development +

The pvOps functionality and documentation + continue to be improved and updated as new empirical techniques are + identified. For example, research efforts have demonstrated the utility of + natural language processing techniques (e.g., topic modeling) and + survival analyses to support evaluation of patterns in O&M records + (Gunda + et al., 2020). Additional statistical methods, such as Hidden + Markov Modeling, have also been successfully used to support + classification of failures within production data + (Hopwood, + Patel, et al., 2022). These and other capabilities will + continue to be added to the package to improve its utility for + supporting empirical analyses of field data.

+
+ + CRediT Authorship Statement +

KLB: Writing - Original Draft, Software - Software Development, + Software - Testing; TG: Conceptualization, Writing - Original Draft, + Software - Design; MWH: Writing - Review & Editing, Software - + Software Development; HM: Writing - Review & Editing, Software - + Software Development; NDJ: Conceptualization, Funding Acquisition, + Project Administration, Supervision, Writing - Review & + Editing.

+
+ + Acknowledgements +

This material is supported by the U.S. Department of Energy, Office + of Energy Efficiency and Renewable Energy - Solar Energy Technologies + Office. Sandia National Laboratories is a multimission laboratory + managed and operated by National Technology and Engineering Solutions + of Sandia, LLC, a wholly owned subsidiary of Honeywell International + Inc., for the U.S. Department of Energy’s National Nuclear Security + Administration under contract DE-NA0003525.

+
+ + + + + + + DeceglieMichael G + JordanDirk + NagAmbarish + DelineChristopher A + ShinnAdam + + RdTools: An open source python library for PV degradation analysis + National Renewable Energy Lab.(NREL), Golden, CO (United States) + 2018 + + + + + + GundaThushara + HackettSean + KrausLaura + DownsChristopher + JonesRyan + McNalleyChristopher + BolenMichael + WalkerAndy + + A machine learning evaluation of maintenance records for common failure modes in PV inverters + IEEE Access + IEEE + 2020 + 8 + 10.1109/ACCESS.2020.3039182 + 211610 + 211620 + + + + + + HolmgrenWilliam F + HansenClifford W + MikofskiMark A + + Pvlib python: A python package for modeling solar energy systems + Journal of Open Source Software + 2018 + 3 + 29 + 10.21105/joss.00884 + 884 + + + + + + + HopwoodMichael W + GundaThushara + SeigneurHubert + WaltersJoseph + + Neural network-based classification of string-level IV curves from physically-induced failures of photovoltaic modules + IEEE Access + IEEE + 2020 + 8 + 10.1109/ACCESS.2020.3021577 + 161480 + 161487 + + + + + + HopwoodMichael W + PatelLekha + GundaThushara + + Classification of photovoltaic failures with hidden markov modeling, an unsupervised statistical approach + Energies + MDPI + 2022 + 15 + 14 + 10.3390/en15145104 + 5104 + + + + + + + HopwoodMichael W + GundaThushara + + Generation of data-driven expected energy models for photovoltaic systems + Applied Sciences + MDPI + 2022 + 12 + 4 + 10.3390/app12041872 + 1872 + + + + + + + HopwoodMichael W + SteinJoshua S + BraidJennifer L + SeigneurHubert P + + Physics-based method for generating fully synthetic IV curve training datasets for machine learning classification of PV failures + Energies + MDPI + 2022 + 15 + 14 + 10.3390/en15145085 + 5085 + + + + + + + MendozaHector + HopwoodMichael + GundaThushara + + pvOps: Improving operational assessments through data fusion + 2021 IEEE 48th photovoltaic specialists conference (PVSC) + IEEE + 2021 + 10.1109/PVSC43889.2021.9518439 + 0112 + 
0119 + + + + + + The pandas development team + + Pandas-dev/pandas: pandas + Zenodo + 202002 + https://doi.org/10.5281/zenodo.3509134 + 10.5281/zenodo.3509134 + + + + + + PierceBenjamin G + KarimiAhmad Maroof + LiuJiQi + FrenchRoger H + BraidJennifer L + + Identifying degradation modes of photovoltaic modules using unsupervised machine learning on electroluminescense images + 2020 47th IEEE photovoltaic specialists conference (PVSC) + IEEE + 2020 + 10.1109/PVSC45281.2020.9301021 + 1850 + 1855 + + + + + + KliseKatherine A + SteinJoshua S + + Performance monitoring using pecos (v. 0.1) + Sandia National Laboraties + 2016 + 10.2172/1734479 + + + + + + Plotly Technologies Inc. + + Collaborative data science + Plotly Technologies Inc. + Montreal, QC + 2015 + https://plot.ly + + + + + + WaskomMichael L. + + Seaborn: Statistical data visualization + Journal of Open Source Software + The Open Journal + 2021 + 6 + 60 + https://doi.org/10.21105/joss.03021 + 10.21105/joss.03021 + 3021 + + + + + + + HunterJ. D. + + Matplotlib: A 2D graphics environment + Computing in Science & Engineering + IEEE COMPUTER SOC + 2007 + 9 + 3 + 10.1109/MCSE.2007.55 + 90 + 95 + + + + + + BirdSteven + KleinEwan + LoperEdward + + Natural language processing with python + O’Reilly Media + 2009 + + + + + + CholletFrançois + others + + Keras + https://keras.io + 2015 + + + + + + PedregosaF. + VaroquauxG. + GramfortA. + MichelV. + ThirionB. + GriselO. + BlondelM. + PrettenhoferP. + WeissR. + DubourgV. + VanderplasJ. + PassosA. + CournapeauD. + BrucherM. + PerrotM. + DuchesnayE. + + Scikit-learn: Machine learning in Python + Journal of Machine Learning Research + 2011 + 12 + 2825 + 2830 + + + + + + PerryKirsten + ViningWilliam + AndersonKevin + MullerMatthew + HansenCliff + + PVAnalytics: A python package for automated processing of solar time series data + National Renewable Energy Lab.(NREL), Golden, CO (United States) + 2022 + + + + +
diff --git a/joss.05755/10.21105.joss.05755.pdf b/joss.05755/10.21105.joss.05755.pdf new file mode 100644 index 0000000000..84deacbb47 Binary files /dev/null and b/joss.05755/10.21105.joss.05755.pdf differ