Skip to content

Commit

Permalink
Make DefectThermodynamics initialisation (defect entry sorting) mor…
Browse files Browse the repository at this point in the history
…e efficient, and update `TODO`s
  • Loading branch information
kavanase committed Dec 17, 2023
1 parent cefd810 commit cdfe532
Show file tree
Hide file tree
Showing 4 changed files with 55 additions and 66 deletions.
14 changes: 3 additions & 11 deletions docs/Dev_ToDo.md
Original file line number Diff line number Diff line change
Expand Up @@ -71,8 +71,6 @@
- Automate `pydefect` shallow defect analysis? At least have notebook showing how to manually do this (Adair's done before?).
- Should tag parsed defects with `is_shallow` (or similar), and then omit these from plotting/analysis
(and note this behaviour in examples/docs)
- Change formation energy plotting and tabulation to `DefectPhaseDiagram` methods rather than standalone
functions – with `pymatgen` update what's the new architecture?
- Better automatic defect formation energy plot colour handling (auto-change colormap based on number of defects, set similar colours for similar defects (types and inequivalent sites)) – and more customisable?
- `aide` labelling of defect species in formation energy plots? See `labellines` package for this (as used in `pymatgen-analysis-defects` chempots plotting)
- Ordering of defects plotted (and thus in the legend) should be physically relevant (whether by energy, or defect type etc.)
Expand All @@ -88,19 +86,13 @@
workflow which ppl often mess up. Can use modified code from `config-coord-plots` (but actually to
scale and automatically/sensibly parsed etc.)(also see `CarrierCapture` functionalities)
- Option for degeneracy-weighted ('reduced') formation energy diagrams, similar to reduced energies in SOD. See Slack discussion and CdTe pyscfermi notebooks.
- Brouwer diagrams. Also see Fig. 6a of the `AiiDA-defects` preprint, want plotting tools like this (some could be PR'd to `py-sc-fermi`)
- Functions to output data and python objects to plug and play with `py-sc-fermi`, `AiiDA`, `CarrierCapture`.
- Alex Squires has functions/notebooks from Seán (-> Xinwei -> Jiayi) -> Alex, for transferring to
`py-sc-fermi` and generating nice plots with the outputs, so add this and make it our suggested
workflow in the docs etc.
- `py-sc-fermi` may have functionality for dealing with complex defect concentrations in the future
(see Slack with Alex; 07/06/23)
- `pydefect` integration, so we can use:
- `pydefect` integration? So we can use:
- Handling of shallow defects
- Readily automated with `vise` if one wants (easy high-throughput and can setup primitive calcs (BS, DOS, dielectric).
- Some nice defect structure and eigenvalue analysis
- GKFO correction
- Showcase `py-sc-fermi` plotting (e.g. from thesis notebook) using `interface` functionality. When doing, add CdTe data as test case for this part of the code. Could also add an optional right-hand-side y-axis for defect concentration (for a chosen anneal temp) to our TLD plotting (e.g. `concentration_T = None`) as done for thesis, noting in docstring that this obvs doesn't account for degeneracy! Also carrier concentration vs Fermi level plots as done in the Kumagai PRX paper? (once properly integrated, add and ask Alex to check/test?)
Brouwer diagrams; show examples of these in docs using `py-sc-fermi` interface tools. Also see Fig. 6a of the `AiiDA-defects` preprint, want plotting tools like this (some could be PR'd to `py-sc-fermi`)

## Housekeeping
- Clean `README` with bullet-point summary of key features, and sidebar like `SnB`. Add correction plots and other example outputs (see MRS poster for this).
Expand Down Expand Up @@ -130,7 +122,7 @@
possibly `NKRED` etc. (make a function to generate `vasp_ncl` calculation files with `ISMEAR = -5`, with option to set different kpoints) - if `ISMEAR = 0` - converged kpoints still prohibitively large, use vasp_converge_files again to check for quicker convergence with ISMEAR = -5.
- Use `NKRED = 2` for `vasp_ncl` chempot calcs, if even kpoints and over 4. Often can't use `NKRED` with `vasp_std`, because we don't know beforehand the kpts in the IBZ (because symmetry on for `vasp_std` chempot calcs)(same goes for `EVENONLY = True`).
- Readily-usable in conjunction with `atomate`, `AiiDA`(-defects), `CarrierCapture`, and give some
examples. Add as optional dependencies.
quick examples? Add as optional dependencies.
- Workflow diagram with: https://twitter.com/Andrew_S_Rosen/status/1678115044348039168?s=20
- Note about `ISPIN = 1` for even no. of electrons defect species, **if you're sure there's no
magnetic ordering!** – which you can check in the `OUTCAR` by looking at `magnetization (x)` `y`
Expand Down
9 changes: 4 additions & 5 deletions doped/analysis.py
Original file line number Diff line number Diff line change
Expand Up @@ -613,10 +613,10 @@ def __init__(
Filename to save the parsed defect entries dict (`DefectsParser.defect_dict`)
to in `output_path`, to avoid having to re-parse defects when later analysing
further and aiding calculation provenance. Can be reloaded using the `loadfn`
function from `monty.serialization` as shown in the docs, or
`DefectThermodynamics.from_json()`. If None (default), set as
"{Chemical Formula}_defect_dict.json" where {Chemical Formula} is the
chemical formula of the host material. If False, no json file is saved.
function from `monty.serialization` (and then input to `DefectThermodynamics`
etc). If None (default), set as "{Chemical Formula}_defect_dict.json" where
{Chemical Formula} is the chemical formula of the host material.
If False, no json file is saved.
Attributes:
defect_dict (dict):
Expand All @@ -625,7 +625,6 @@ def __init__(
defect calculation folder name (_if it is a recognised defect name_),
else it is set to the default `doped` name for that defect.
"""
# TODO: Need to add `DefectThermodynamics.from_json()` etc methods as mention in docstring here
self.output_path = output_path
self.dielectric = dielectric
self.skip_corrections = skip_corrections
Expand Down
43 changes: 17 additions & 26 deletions doped/generation.py
Original file line number Diff line number Diff line change
Expand Up @@ -2212,35 +2212,26 @@ def _sort_defect_entries(defect_entries_dict, element_list=None, symm_ops=None):
except ValueError as value_err:
# possibly defect entries with names not in doped format, try sorting without using name:
try:

def _defect_entry_sorting_func(defect_entry):
name_from_defect = get_defect_name_from_defect(
defect_entry.defect,
element_list=element_list,
symm_ops=symm_ops,
)
return (
defect_entry.defect.defect_type.value,
element_list.index(_first_and_second_element(defect_entry.defect.name)[0]),
element_list.index(_first_and_second_element(defect_entry.defect.name)[1]),
# name without charge:
name_from_defect,
defect_entry.charge_state, # charge state
)

return dict(
sorted(
defect_entries_dict.items(),
key=lambda s: (
s[1].defect.defect_type.value,
element_list.index(
_first_and_second_element(
get_defect_name_from_defect(
s[1].defect,
element_list=element_list,
symm_ops=symm_ops,
)
)[0]
),
element_list.index(
_first_and_second_element(
get_defect_name_from_defect(
s[1].defect,
element_list=element_list,
symm_ops=symm_ops,
)
)[1]
),
# name without charge:
get_defect_name_from_defect(
s[1].defect, element_list=element_list, symm_ops=symm_ops
),
s[1].charge_state, # charge state
),
key=lambda s: _defect_entry_sorting_func(s[1]), # sort by defect entry object
)
)
except ValueError:
Expand Down
55 changes: 31 additions & 24 deletions doped/thermodynamics.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,20 +26,10 @@
from doped.generation import _sort_defect_entries
from doped.utils.plotting import _TLD_plot

# TODO: Previous `pymatgen` issues, fixed?
# - Currently the `PointDefectComparator` object from `pymatgen.analysis.defects.thermodynamics` is
# used to group defect charge states for the transition level plot / transition level map outputs.
# For interstitials, if the closest Voronoi site from the relaxed structure thus differs significantly
# between charge states, this will give separate lines for each charge state. This is kind of ok,
# because they _are_ actually different defect sites, but should have intelligent defaults for dealing
# with this (see `TODO` in `thermo_from_defect_dict` in `analysis.py`; at least similar colours for
# similar defect types, an option to just show amalgamated lowest energy charge states for each
# _defect type_). NaP is an example for this - should have a test built for however we want to handle
# cases like this. See Ke's example case too with different interstitial sites.
# - GitHub issue related to `DefectThermodynamics`: https://github.com/SMTG-Bham/doped/issues/3 -> Think
# about how we want to refactor the `DefectThermodynamics` object!
# - Note that if you edit the entries in a DefectThermodynamics after creating it, you need to
# `thermo.find_stable_charges()` to update the transition level map etc.
# TODO: Add dpd_all_transition_levels function to dope_stuff (or wherever these functions go when
# modularised), which iteratively prints all potential charge transition levels for the input
# DefectPhaseDiagram object, including metastable states. Template using the dpd_transition_levels
# function that's already in dope_stuff; see https://github.com/SMTG-Bham/doped/issues/3 for code


def bold_print(string: str) -> None:
Expand Down Expand Up @@ -117,7 +107,7 @@ class DefectThermodynamics(MSONable):

def __init__(
self,
defect_entries: List[DefectEntry],
defect_entries: Union[List[DefectEntry], Dict[str, DefectEntry]],
chempots: Optional[Dict] = None,
el_refs: Optional[Dict] = None,
vbm: Optional[float] = None,
Expand All @@ -133,13 +123,13 @@ def __init__(
Args:
defect_entries ([DefectEntry]):
A list of DefectEntry objects. Note that `DefectEntry.name` attributes
are used for grouping and plotting purposes! These should end be in
the format "{defect_name}_{optional_site_info}_{charge_state}". If
the DefectEntry.name attribute is not defined or does not end with
the charge state, then the entry will be renamed with the doped
default name.
defect_entries ([DefectEntry] or {str: DefectEntry}):
A list or dict of DefectEntry objects. Note that `DefectEntry.name`
attributes are used for grouping and plotting purposes! These should
be in the format "{defect_name}_{optional_site_info}_{charge_state}".
If the DefectEntry.name attribute is not defined or does not end with
the charge state, then the entry will be renamed with the doped
default name.
chempots (dict):
Dictionary of chemical potentials to use for calculating the defect
formation energies. This can have the form of
Expand Down Expand Up @@ -176,6 +166,8 @@ def __init__(
If None (default), will use "gap" from the calculation_metadata
dict attributes of the DefectEntry objects in `defect_entries`.
"""
if isinstance(defect_entries, dict):
defect_entries = list(defect_entries.values())
self.defect_entries = defect_entries
self.chempots, self.el_refs = _parse_chempots(chempots, el_refs)

Expand Down Expand Up @@ -364,8 +356,6 @@ def from_defect_dict(
Returns:
doped DefectThermodynamics object (DefectThermodynamics)
"""
# TODO: Can we make the thermo generation more efficient? What's the bottleneck in it's
# initialisation? `pymatgen` site-matching that can be avoided?
# TODO: (1) refactor the site-matching to just use the already-parsed site positions, and then
# merge interstitials according to this algorithm:
# 1. For each interstitial defect type, count the number of parsed calculations per charge
Expand All @@ -379,6 +369,18 @@ def from_defect_dict(
# (2) optionally retain/remove unstable (in the gap) charge states (rather than current
# default range of (VBM - 1eV, CBM + 1eV))...
# When doing this, add DOS object attribute, to then use with Alex's doped - py-sc-fermi code.

# Related TODO: Previous `pymatgen` issues, fixed?
# - Currently the `PointDefectComparator` object from `pymatgen.analysis.defects.thermodynamics`
# is used to group defect charge states for the transition level plot / transition level map
# outputs. For interstitials, if the closest Voronoi site from the relaxed structure thus
# differs significantly between charge states, this will give separate lines for each charge
# state. This is kind of ok, because they _are_ actually different defect sites, but should
# have intelligent defaults for dealing with this (at least similar colours for similar defect
# types, an option to just show amalgamated lowest energy charge states for each _defect type_).
# NaP is an example for this - should have a test built for however we want to handle cases like
# this. See Ke's example case too with different interstitial sites.

# TODO: Should loop over input defect entries and check that the same bulk (energy and
# calculation_metadata) was used in each case (by proxy checks that same bulk/defect
# incar/potcar/kpoints settings were used in all cases, from each bulk/defect combination being
Expand Down Expand Up @@ -423,6 +425,11 @@ def _parse_transition_levels(self):
charge state, then the entry will be renamed with the doped default
name.
Note that if the entries in DefectThermodynamics are edited after
initialisation, then DefectThermodynamics._parse_transition_levels()
should be called if the attributes (transition_level_map etc.) and
outputs (plotting and tabulation) are to be updated properly.
Code for parsing the transition levels was originally templated from
the pyCDT (pymatgen<=2022.7.25) thermodynamics code (deleted in later
versions).
Expand Down

0 comments on commit cdfe532

Please sign in to comment.