An R package that runs checks on the Tree Swallow databases. It produces a list with the results (the problematic lines) of each check that was run.
To install, do:
```r
# install and load devtools to be able to install packages from GitHub with install_github
install.packages("devtools")
library(devtools)

# install and load the package
install_github("frousseu/BDTREScheck")
library(BDTREScheck)

# open the help page for the main function
?checkBD
```
adults: ferme, nichoir, id, annee, nnich, idcouvee, heure, jjulien, prefixe, suffixe, idadult, condition, sexe_morpho, age_morpho, sexe_gen, locus_sexe_gen, couleur, age_exact, laile1, laile2, masse, tarse1, tarse2, trougauche, troudroite, pararectrice, plaqueincu, Cause_recapt, commentaire, observateur
chicks: ferme, nichoir, id, annee, nnich, idcouvee, heure, jjulien, prefixe, suffixe, idois, sexe_gen, locus_sexe_gen, condition, numero_oisillon, jour_suivi, envol, masse, 9primaires1, 9primaires2, tarse1, tarse2, commentaires, manipulateur
broods: idcouvee, id, ferme, nichoir, annee, codesp, nnich, noeufs, noisnes, noisenvol, noismort, dispa_ois, dispa_oeufs, abandon, pred_pot, dponte, dincub, declomin, declomax, denvomin, denvomax, dabanmin, dabanmax, idF1, idM1, idF2, idF3, idM2, idM3, Commentaires
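The column lists above can be compared against a data frame before running the checks. The following is a minimal sketch in base R, not the package's own implementation: `compare_columns` is a hypothetical helper, the toy values are placeholders, and only the first five broods columns are used to keep the example short.

```r
# First five expected column names of the broods database (from the list above)
expected_broods <- c("idcouvee", "id", "ferme", "nichoir", "annee")

# Hypothetical helper (not part of BDTREScheck): reports columns that are
# expected but absent, and columns that are present but not expected
compare_columns <- function(df, expected) {
  list(
    missing    = setdiff(expected, names(df)),
    unexpected = setdiff(names(df), expected)
  )
}

# Toy data frame with placeholder values; "Annee" is deliberately miscased
toy <- data.frame(
  idcouvee = "x",
  id       = "y",
  ferme    = 1,
  nichoir  = 1,
  Annee    = 2015,
  stringsAsFactors = FALSE
)

res <- compare_columns(toy, expected_broods)
res$missing     # "annee"
res$unexpected  # "Annee"
```

Because `setdiff` is case-sensitive, a column typed as `Annee` is reported both as an unexpected column and as a missing `annee`.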
Run checks:
```r
library(BDTREScheck)

x <- checkBD(
  dsn = "//argus.dinf.fsci.usherbrooke.ca/DBio_Rech_Data/Projet_Hirondelle/1_Donnees/11_Principales",
  year = 2015,
  adultsNew = NULL,
  broodsNew = NULL,
  chicksNew = NULL,
  adultsOld = "Adultes_2004-2015.xlsx",
  broodsOld = "Couvee_2004-2015.xlsx",
  chicksOld = "Oisillons_2004-2015.xlsx",
  sheet = 1
)
```
Display results:
```r
# print the results of all checks
x

# print the result of the 4th check only
x[4]
```
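With well over a hundred checks, it can help to list only those that returned something. The sketch below uses a mock list in place of the real output. It assumes, without confirmation from the package, that each element is either `NULL` or a data frame of problematic lines; the element names are made up for illustration.

```r
# Mock object standing in for the value returned by checkBD();
# the real structure and names may differ
x_mock <- list(
  "check A" = NULL,
  "check B" = data.frame(ferme = 1, nichoir = 2),
  "check C" = data.frame()
)

# Single brackets keep the list wrapper (and the name),
# double brackets extract the element itself
x_mock[2]
x_mock[[2]]

# Flag the elements that are data frames with at least one row
has_rows <- vapply(
  x_mock,
  function(el) is.data.frame(el) && nrow(el) > 0,
  logical(1)
)
flagged <- names(x_mock)[has_rows]
flagged  # "check B"
```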
Show the list of checks:
```r
checkShow(html = FALSE)
```
- add part at the end of Nghia script (already done?)
- decide how to handle the read_excel warnings about expected column formats
- change the default values of the database names
- find a way to remove the warnings raised by the check on the number of characters when it relies on column names that are not present in all databases
- think about the type of input database and the argument options (a data.frame already in the environment, a path to an Excel or csv file, a remote database, etc.)
- implement a few other checks that are already described but not yet coded
- looking for more than one label for sex returns a lot of results (with M or F and I), so it may not be the right approach
ID | Checks |
---|---|
1 | "GENERAL: Remove rows with NA id's in broodsNew" |
2 | "GENERAL: Remove rows with NA id's in adultsNew" |
3 | "GENERAL: Remove rows with NA id's in chicksNew" |
4 | "GENERAL: Are column names in adult consistent?" |
5 | "GENERAL: Are column names in chicksNew consistent?" |
6 | "GENERAL: Are column names in broodsNew consistent?" |
7 | "GENERAL: Are column names consistent across old and new databases" |
8 | "ADULTS: Wrong ferme id observed" |
9 | "NESTLINGS: Wrong ferme id observed" |
10 | "BROODS: Wrong ferme id observed" |
11 | "ADULTS: Wrong nichoir id observed" |
12 | "NESTLINGS: Wrong nichoir id observed" |
13 | "BROODS: Wrong nichoir id observed" |
14 | "ADULTS: id column doesn't correspond to ferme + nichoir id" |
15 | "NESTLINGS: id column doesn't correspond to ferme + nichoir id" |
16 | "BROODS: id column doesn't correspond to ferme + nichoir id" |
17 | "ADULTS: idcouv doesn't correspond to ferme + nichoir + annee + nnich column" |
18 | "NESTLINGS: idcouv doesn't correspond to ferme + nichoir + annee + nnich column" |
19 | "BROODS: idcouv doesn't correspond to ferme + nichoir + annee + nnich column" |
20 | "ADULTS: Wrong prefixe name observed (is it a new prefixe?)" |
21 | "ADULTS: Wrong suffixe name observed (too short/long)" |
22 | "ADULTS: idadult doesn't correspond to prefixe + suffixe" |
23 | "ADULTS: idadult doesn't correspond to id + year + LETTER" |
24 | "NESTLINGS: Wrong prefixe name observed (is it a new prefixe?)" |
25 | "NESTLINGS: Wrong suffixe name observed (too short/long)" |
26 | "NESTLINGS: idois doesn't correspond to prefixe + suffixe" |
27 | "NESTLINGS: idois doesn't correspond to idcouvee + numero_oisillon" |
28 | "ADULTS: Show all unique values in columns for which the number of possible values is restricted" |
29 | "BROODS: Show all unique values in columns for which the number of possible values is restricted" |
30 | "NESTLINGS: Show all unique values in columns for which the number of possible values is restricted" |
31 | "ADULTS/BROODS: Females assigned to an idcouv in adults db but no female is assigned to this idcouv in broods db (check capture dates)" |
32 | "ADULTS/BROODS: Females assigned to an idcouv in adults db but not referenced in broods db (idF2 or idF3)" |
33 | "ADULTS/BROODS: Females captured at a nestbox but not assigned to an idcouv in adults db (check nnich)" |
34 | "ADULTS/BROODS: Females captured at a nestbox but assigned to a wrong idcouv in adults db (check nnich)" |
35 | "ADULTS/BROODS: Males assigned to an idcouv in adults db but no male is assigned to this idcouv in broods db (check capture dates or morpho/gen sex identity)" |
36 | "ADULTS/BROODS: Males assigned to an idcouv in adults db but not referenced in broods db (idM2 or idM3)" |
37 | "ADULTS/BROODS: Males captured at a nestbox but not assigned to an idcouv in adults db (check nnich)" |
38 | "ADULTS/BROODS: Males captured at a nestbox but assigned to a wrong idcouv in adults db (check nnich)" |
39 | "BROODS: Female referenced as second female when only one female captured (change to idF1 - some exceptions possible)" |
40 | "BROODS: Male referenced as second males when only one male captured (change to idM1)" |
41 | "BROODS: Two males captured in the same nestbox but not properly reported (idM2 and idM3, not idM1)" |
42 | "BROODS: Males (idM1) assigned to brood with no nestlings (not coherent)" |
43 | "ADULTS: Sex/age incoherencies (few exceptions when condition !=0)" |
44 | "ADULTS: Some colors not in the list of possible values?" |
45 | "ADULTS: Brown females (>50%) not assigned to SY?" |
46 | "ADULTS: Individual with a couleur assigned, but without morpho_age (no check for condition != 0)" |
47 | "ADULTS: Dead individual with a couleur assigned, but without morpho_age (probably nothing to do)" |
48 | "ADULTS: Capture time outside 06:00 and 20:40 (max)" |
49 | "NESTLINGS: Capture time outside 06:00 and 20:40 (max)" |
50 | "ADULTS: Wrong sexe_gen/locus_sexe_gen association (both NA or with values)" |
51 | "ADULTS: Check for adults with changing sexe_morph (within the current breeding season ONLY)" |
52 | "ADULTS: Check for adults with changing sexe_morph (across seasons - MUST be concordant (see Donnees_Codes.docx))" |
53 | "ADULTS: Check for adults with changing sexe_gen (within the current breeding season ONLY)" |
54 | "ADULTS: Check for adults with changing sexe_gen (across seasons)" |
55 | "ADULTS: Check for adults with changing locus_sexe_gen (within the current breeding season ONLY)" |
56 | "ADULTS: Check for adults with changing locus_sexe_gen (across seasons)" |
57 | "ADULTS: Missing one wing measurement" |
58 | "ADULTS: Missing one tarsus measurement" |
59 | "ADULTS: Wing measurement outside the range of likely values (105-128 mm)" |
60 | "ADULTS: Wing measurement 1 and 2 too far apart (>1 mm)" |
61 | "ADULTS: Weight measurements outside the range of likely values (15-30g)" |
62 | "ADULTS: Tarsus measurements outside the range of likely values (10-14 mm)" |
63 | "ADULTS: tarsus measurement 1 and 2 too far apart (>0.1 mm)" |
64 | "ADULTS: Wrong condition status" |
65 | "ADULTS: Wrong plaqueincu status" |
66 | "ADULTS: Male with brood patch (plaqueincu)" |
67 | "ADULTS: Wrong Cause_capture status" |
68 | "ADULTS: Visits are not all 2 days apart for the following farms (maybe caused by presence of NAs)" |
69 | "NESTLINGS: Visits are not all 2 days apart for the following farms (maybe caused by presence of NAs)" |
70 | "ADULTS: Check for duplicates using all columns" |
71 | "BROODS: Check for duplicates using all columns" |
72 | "CHICKS: Check for duplicates using all columns" |
73 | "ADULTS: Check for adults with more than one entry for a single date (probably duplicated lines - remove one)" |
74 | "NESTLINGS: Check for chicks with more than one entry for a single date (probably duplicated lines - remove one)" |
75 | "NESTLINGS: Check for chicks with more than one entry for a single age" |
76 | "ADULTS: Check for adults found at more than one farm (maybe not an error)" |
77 | "NESTLINGS: Check for nestlings found at more than one nestbox" |
78 | "NESTLINGS/BROODS: Capture date of young is later than the minimal abandonment date if nest was abandoned" |
79 | "NESTLINGS/BROODS: Capture date of young is before the laying date" |
80 | "NESTLINGS/BROODS: jjulien of young that doesn't correspond to declomax + jour_suivi" |
81 | "NESTLINGS: Wrong sexe_gen/locus_sexe_gen association (both NA or with values)" |
82 | "NESTLINGS: Check for individuals with changing sexe_gen" |
83 | "NESTLINGS: Check for individuals with changing locus_sexe_gen" |
84 | "NESTLINGS: Wrong chick conditions (4 possible values; vivant, disparu, mort or disparuj16)" |
85 | "NESTLINGS: Dead or disappeared nestlings without a 0 for flight code (few exceptions possible, see comments)" |
86 | "NESTLINGS: Nestling with disparuj16 condition but without a 1 for flight code" |
87 | "NESTLINGS: Make sure that living nestlings with a 0 flight code are eventually dead or disappeared" |
88 | "NESTLINGS: Make sure that no nestling comes back to life" |
89 | "NESTLINGS: Check that numero_oisillon are from 1 to nb of nestlings" |
90 | "NESTLINGS: Nestlings which were followed for 12 days or more should have a band number as id and otherwise they should have a farm/brood id (maybe an exception, comments [Oisillon non bagué car trop petit à J12])" |
91 | "NESTLINGS: Chicks for which there is a band number but it does not correspond to the id of the chick" |
92 | "NESTLINGS: Missing one wing measurement" |
93 | "NESTLINGS: Missing one tarsus measurement" |
94 | "NESTLINGS: 9primaires larger than expected (65 mm, no age consideration)" |
95 | "NESTLINGS: 9primaires outside the range of likely values at 6-day-old (0 - 10 mm)" |
96 | "NESTLINGS: 9primaires outside the range of likely values at 12-day-old (5 - 45 mm)" |
97 | "NESTLINGS: 9primaires outside the range of likely values at 16-day-old (15 - 65 mm)" |
98 | "NESTLINGS: 9primaires 1 and 2 too far apart (>0.1 mm)" |
99 | "NESTLINGS: Weight measurements larger than expected (27 g, no age consideration)" |
100 | "NESTLINGS: Weight measurements outside the range of likely value at 2-days-old (1-8 g)" |
101 | "NESTLINGS: Weight measurements outside the range of likely value at 6-days-old (2-20 g)" |
102 | "NESTLINGS: Weight measurements outside the range of likely value at 12-days-old (10-27 g)" |
103 | "NESTLINGS: Weight measurements outside the range of likely value at 16-days-old (12-27 g)" |
104 | "NESTLINGS: Tarsus measurements outside the range of likely values (10-14 mm)" |
105 | "NESTLINGS: Tarsus measurement 1 and 2 too far apart (>0.1 mm)" |
106 | "NESTLINGS/BROODS: Broods that are in chicks db but not in broods db" |
107 | "NESTLINGS/BROODS: TRES broods with at least one nestling that are in broods db but not in chicks db" |
108 | "ADULTS/NESTLINGS: Check for individuals with changing sexe_gen across db" |
109 | "ADULTS: Check for either missing or wrong age_exact column for individuals hatched in our study system" |
110 | "BROODS: Check for duplicates in idcouvee" |
111 | "BROODS: Check for duplicates in id/nnich (change nnich)" |
112 | "BROODS: Check for missing id (add lines for them)" |
113 | "BROODS: Wrong codesp (some other species were exceptionally ringed)" |
114 | "BROODS: checks if the nnich number is good assuming only one line per brood" |
115 | "BROODS: Wrong abandon / pred_pot" |
116 | "BROODS: Wrong chronology in events within a brood" |
117 | "BROODS: Broods with more nestlings than eggs" |
118 | "BROODS: More/less nestlings than nestling status (noisnes != noisenvol + noismort + dispa_ois)" |
119 | "BROODS: Too many eggs/nestlings within the same brood (8 and more; few exceptions possible, see comments)" |
120 | "BROODS: Event dates outside the range of possible values (JJ 95-220)" |
121 | "BROODS: No fledging or abandon date for TRES broods (exceptions possible, see comments)" |
122 | "BROODS: Missing 1 value in declo (min or max)" |
123 | "BROODS: Missing 1 value in denvo for TRES (min or max)" |
124 | "BROODS: Wrong denvomin for other species (should be NA)" |
125 | "BROODS: Missing 1 value in daban (min or max)" |
126 | "BROODS: Very long time elapse between laying date and incubation initiation (> 2 weeks; 2 different broods?)" |
127 | "BROODS: Very short time elapse between laying date and incubation initiation (< 5 days)" |
128 | "BROODS: Very long time elapse between laying date and hatching date (> 4 weeks; 2 different broods?)" |
129 | "BROODS: Very short time elapse between laying date and hatching date (< 10 days)" |
130 | "BROODS: Very long time elapse between incubation initiation and hatching date (> 2 weeks; 2 different broods?)" |
131 | "BROODS: Very short time elapse between incubation initiation and hatching date (< 1 week)" |
132 | "BROODS: Too long time elapse between minimum and maximum hatching date (> 1 day)" |
133 | "BROODS: Too long time elapse between minimum and maximum abandon date (> 1 day) when nestlings >=1" |
134 | "BROODS: Too long time elapse between minimum and maximum fledging date (> 1 week)" |
135 | "BROODS: Too short time elapse between minimum and maximum fledging date (< 1 day; exception possible, see comments)" |
136 | "NESTLINGS/BROODS: Inconsistency in the number of nestlings between databases (NESTLINGS: Nois, Nenvol, Ndead, Ndispa - MAYBE some nestlings with num_ois = NA)" |
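
To give an idea of what such checks involve, here is a minimal sketch of checks 57, 59 and 60 (adult wing measurements) in base R, run on a toy data frame. It is not the package's implementation: the toy values are made up, and treating the 105-128 mm bounds as inclusive is an assumption.

```r
# Toy adults data; idadult values are placeholders, not real band numbers
adults_toy <- data.frame(
  idadult = c("a1", "a2", "a3", "a4"),
  laile1  = c(115, 130, 110, 120),
  laile2  = c(115.5, 130, 112, NA),
  stringsAsFactors = FALSE
)

# Check 57: exactly one of the two wing measurements is missing
missing_one <- with(adults_toy, xor(is.na(laile1), is.na(laile2)))

# Check 59: a non-missing measurement falls outside 105-128 mm
# (bounds assumed inclusive)
outside <- function(v) !is.na(v) & (v < 105 | v > 128)
out_of_range <- with(adults_toy, outside(laile1) | outside(laile2))

# Check 60: both measurements present and more than 1 mm apart
too_far <- with(
  adults_toy,
  !is.na(laile1) & !is.na(laile2) & abs(laile1 - laile2) > 1
)

adults_toy[missing_one, ]   # a4
adults_toy[out_of_range, ]  # a2
adults_toy[too_far, ]       # a3
```

Guarding each comparison with `!is.na()` keeps missing measurements from producing `NA` in the logical index, which would otherwise add all-`NA` rows to the subset.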