Commit

Merge pull request #11082 from IQSS/11065-extend-search-api-to-include-type-counts

include total_count_per_object_type in search response
ofahimIQSS authored Dec 19, 2024
2 parents b329450 + b74eb1a commit bee3cdf
Showing 4 changed files with 81 additions and 4 deletions.
@@ -0,0 +1 @@
The JSON payload of the search endpoint has been extended to include total_count_per_object_type for types: dataverse, dataset, and files when the search parameter "&show_type_counts=true" is passed in.
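A client opts in to the counts by appending the new parameter to an ordinary search request. The following is a minimal sketch of building such a URL with the JDK only; the class and method names are hypothetical and not part of this commit, and the base URL is the demo installation already used in the guide's examples.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Dataverse code base.
class SearchUriSketch {

    // Builds a Search API URL and appends show_type_counts=true only on request.
    static URI buildSearchUri(String baseUrl, String query, boolean showTypeCounts) {
        // Percent-encode the query so terms such as "title:data" are safe in a URL.
        String encodedQuery = URLEncoder.encode(query, StandardCharsets.UTF_8);
        StringBuilder url = new StringBuilder(baseUrl)
                .append("/api/search?q=")
                .append(encodedQuery);
        if (showTypeCounts) {
            url.append("&show_type_counts=true");
        }
        return URI.create(url.toString());
    }

    public static void main(String[] args) {
        System.out.println(buildSearchUri("https://demo.dataverse.org", "birds", true));
    }
}
```

Leaving the parameter off produces the same request as before this change, so existing clients are unaffected.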
13 changes: 9 additions & 4 deletions doc/sphinx-guides/source/api/search.rst
@@ -21,9 +21,9 @@ Please note that in Dataverse Software 4.3 and older the "citation" field wrappe
Parameters
----------

=============== ======= ===========
================ ======= ===========
Name Type Description
=============== ======= ===========
================ ======= ===========
q string The search term or terms. Using "title:data" will search only the "title" field. "*" can be used as a wildcard either alone or adjacent to a term (i.e. "bird*"). For example, https://demo.dataverse.org/api/search?q=title:data . For a list of fields to search, please see https://github.com/IQSS/dataverse/issues/2558 (for now).
type string Can be either "dataverse", "dataset", or "file". Multiple "type" parameters can be used to include multiple types (i.e. ``type=dataset&type=file``). If omitted, all types will be returned. For example, https://demo.dataverse.org/api/search?q=*&type=dataset
subtree string The identifier of the Dataverse collection to which the search should be narrowed. The subtree of this Dataverse collection and all its children will be searched. Multiple "subtree" parameters can be used to include multiple Dataverse collections. For example, https://demo.dataverse.org/api/search?q=data&subtree=birds&subtree=cats .
@@ -38,7 +38,8 @@ show_entity_ids boolean Whether or not to show the database IDs of the search
geo_point string Latitude and longitude in the form ``geo_point=42.3,-71.1``. You must supply ``geo_radius`` as well. See also :ref:`geospatial-search`.
geo_radius string Radial distance in kilometers from ``geo_point`` (which must be supplied as well) such as ``geo_radius=1.5``.
metadata_fields string Includes the requested fields for each dataset in the response. Multiple "metadata_fields" parameters can be used to include several fields. The value must be in the form "{metadata_block_name}:{field_name}" to include a specific field from a metadata block (see :ref:`example <dynamic-citation-some>`) or "{metadata_field_set_name}:\*" to include all the fields for a metadata block (see :ref:`example <dynamic-citation-all>`). "{field_name}" cannot be a subfield of a compound field. If "{field_name}" is a compound field, all subfields are included.
=============== ======= ===========
show_type_counts boolean Whether or not to include total_count_per_object_type for types: Dataverse, Dataset, and Files.
================ ======= ===========

Basic Search Example
--------------------
@@ -701,7 +702,11 @@ The above example ``metadata_fields=citation:dsDescription&metadata_fields=citat
"published_at": "2021-03-16T08:11:54Z"
}
],
"count_in_response": 4
"count_in_response": 4,
"total_count_per_object_type": {
"Datasets": 2,
"Dataverses": 2
}
}
}
10 changes: 10 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/api/Search.java
@@ -73,6 +73,7 @@ public Response search(
@QueryParam("metadata_fields") List<String> metadataFields,
@QueryParam("geo_point") String geoPointRequested,
@QueryParam("geo_radius") String geoRadiusRequested,
@QueryParam("show_type_counts") boolean showTypeCounts,
@Context HttpServletResponse response
) {

@@ -210,6 +211,15 @@ public Response search(
}

value.add("count_in_response", solrSearchResults.size());
if (showTypeCounts && !solrQueryResponse.getTypeFacetCategories().isEmpty()) {
JsonObjectBuilder objectTypeCounts = Json.createObjectBuilder();
for (FacetCategory facetCategory : solrQueryResponse.getTypeFacetCategories()) {
for (FacetLabel facetLabel : facetCategory.getFacetLabel()) {
objectTypeCounts.add(facetLabel.getName(), facetLabel.getCount());
}
}
value.add("total_count_per_object_type", objectTypeCounts);
}
/**
* @todo Returning the fq might be useful as a troubleshooting aid
* but we don't want to expose the raw dataverse database ids in
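The block added to Search.java flattens every label of every type facet category into one name-to-count object, and skips the key entirely when the flag is off or no type facets came back. A rough standalone sketch of that logic follows, assuming plain collections in place of the real `FacetCategory`, `FacetLabel`, and `JsonObjectBuilder` types; `TypeCountsSketch`, `Label`, and `typeCountsOrNull` are hypothetical names introduced only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the logic in Search.java; not the actual implementation.
class TypeCountsSketch {

    // Simplified stand-in for a facet label: a display name plus its hit count.
    static final class Label {
        final String name;
        final long count;

        Label(String name, long count) {
            this.name = name;
            this.count = count;
        }
    }

    // Returns the per-type counts, or null when the key should be omitted from the response.
    static Map<String, Long> typeCountsOrNull(boolean showTypeCounts, List<List<Label>> categories) {
        if (!showTypeCounts || categories.isEmpty()) {
            return null;
        }
        Map<String, Long> counts = new LinkedHashMap<>();
        for (List<Label> category : categories) {
            for (Label label : category) {
                // A later label with the same name overwrites the earlier one.
                counts.put(label.name, label.count);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<Label>> categories = List.of(List.of(
                new Label("Dataverses", 1),
                new Label("Datasets", 3),
                new Label("Files", 6)));
        System.out.println(typeCountsOrNull(true, categories));
    }
}
```

Returning nothing rather than an empty object matches what the integration test checks: `data.total_count_per_object_type` is absent both when the parameter is missing or false and when the search has no results.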
61 changes: 61 additions & 0 deletions src/test/java/edu/harvard/iq/dataverse/api/SearchIT.java
@@ -1347,4 +1347,65 @@ public void testSearchFilesAndUrlImages() {
.body("data.items[0].url", CoreMatchers.containsString("/datafile/"))
.body("data.items[0]", CoreMatchers.not(CoreMatchers.hasItem("image_url")));
}

@Test
public void testShowTypeCounts() {
//Create 1 user and 1 Dataverse/Collection
Response createUser = UtilIT.createRandomUser();
String username = UtilIT.getUsernameFromResponse(createUser);
String apiToken = UtilIT.getApiTokenFromResponse(createUser);
String affiliation = "testAffiliation";

// test total_count_per_object_type is not included because the results are empty
Response searchResp = UtilIT.search(username, apiToken, "&show_type_counts=true");
searchResp.then().assertThat()
.statusCode(OK.getStatusCode())
.body("data.total_count_per_object_type", CoreMatchers.equalTo(null));

Response createDataverseResponse = UtilIT.createRandomDataverse(apiToken, affiliation);
assertEquals(201, createDataverseResponse.getStatusCode());
String dataverseAlias = UtilIT.getAliasFromResponse(createDataverseResponse);

// create 3 Datasets, each with 2 Datafiles
for (int i = 0; i < 3; i++) {
Response createDatasetResponse = UtilIT.createRandomDatasetViaNativeApi(dataverseAlias, apiToken);
createDatasetResponse.then().assertThat()
.statusCode(CREATED.getStatusCode());
String datasetId = UtilIT.getDatasetIdFromResponse(createDatasetResponse).toString();

// putting the dataverseAlias in the description of each file so the search q={dataverseAlias} will return dataverse, dataset, and files for this test only
String jsonAsString = "{\"description\":\"" + dataverseAlias + "\",\"directoryLabel\":\"data/subdir1\",\"categories\":[\"Data\"], \"restrict\":\"false\" }";

String pathToFile = "src/main/webapp/resources/images/dataverseproject.png";
Response uploadImage = UtilIT.uploadFileViaNative(datasetId, pathToFile, jsonAsString, apiToken);
uploadImage.then().assertThat()
.statusCode(200);
pathToFile = "src/main/webapp/resources/js/mydata.js";
Response uploadFile = UtilIT.uploadFileViaNative(datasetId, pathToFile, jsonAsString, apiToken);
uploadFile.then().assertThat()
.statusCode(200);

// This call forces a wait for dataset indexing to finish and gives time for file uploads to complete
UtilIT.search("id:dataset_" + datasetId, apiToken);
}

// Test Search without show_type_counts
searchResp = UtilIT.search(dataverseAlias, apiToken);
searchResp.then().assertThat()
.statusCode(OK.getStatusCode())
.body("data.total_count_per_object_type", CoreMatchers.equalTo(null));
// Test Search with show_type_counts = FALSE
searchResp = UtilIT.search(dataverseAlias, apiToken, "&show_type_counts=false");
searchResp.then().assertThat()
.statusCode(OK.getStatusCode())
.body("data.total_count_per_object_type", CoreMatchers.equalTo(null));
// Test Search with show_type_counts = TRUE
searchResp = UtilIT.search(dataverseAlias, apiToken, "&show_type_counts=true");
searchResp.prettyPrint();
searchResp.then().assertThat()
.statusCode(OK.getStatusCode())
.body("data.total_count_per_object_type.Dataverses", CoreMatchers.is(1))
.body("data.total_count_per_object_type.Datasets", CoreMatchers.is(3))
.body("data.total_count_per_object_type.Files", CoreMatchers.is(6));
}
}