include total_count_per_object_type in search response #11082

Merged: 8 commits, Dec 19, 2024
@@ -0,0 +1 @@
+The JSON payload of the search endpoint has been extended to include total_count_per_object_type for types: dataverse, dataset, and files when the search parameter "&show_type_counts=true" is passed in.
13 changes: 9 additions & 4 deletions doc/sphinx-guides/source/api/search.rst
@@ -21,9 +21,9 @@ Please note that in Dataverse Software 4.3 and older the "citation" field wrappe
Parameters
----------

-=============== ======= ===========
+================ ======= ===========
Name Type Description
-=============== ======= ===========
+================ ======= ===========
q string The search term or terms. Using "title:data" will search only the "title" field. "*" can be used as a wildcard either alone or adjacent to a term (i.e. "bird*"). For example, https://demo.dataverse.org/api/search?q=title:data . For a list of fields to search, please see https://github.com/IQSS/dataverse/issues/2558 (for now).
type string Can be either "dataverse", "dataset", or "file". Multiple "type" parameters can be used to include multiple types (i.e. ``type=dataset&type=file``). If omitted, all types will be returned. For example, https://demo.dataverse.org/api/search?q=*&type=dataset
subtree string The identifier of the Dataverse collection to which the search should be narrowed. The subtree of this Dataverse collection and all its children will be searched. Multiple "subtree" parameters can be used to include multiple Dataverse collections. For example, https://demo.dataverse.org/api/search?q=data&subtree=birds&subtree=cats .
@@ -38,7 +38,8 @@ show_entity_ids boolean Whether or not to show the database IDs of the search
geo_point string Latitude and longitude in the form ``geo_point=42.3,-71.1``. You must supply ``geo_radius`` as well. See also :ref:`geospatial-search`.
geo_radius string Radial distance in kilometers from ``geo_point`` (which must be supplied as well) such as ``geo_radius=1.5``.
metadata_fields string Includes the requested fields for each dataset in the response. Multiple "metadata_fields" parameters can be used to include several fields. The value must be in the form "{metadata_block_name}:{field_name}" to include a specific field from a metadata block (see :ref:`example <dynamic-citation-some>`) or "{metadata_field_set_name}:\*" to include all the fields for a metadata block (see :ref:`example <dynamic-citation-all>`). "{field_name}" cannot be a subfield of a compound field. If "{field_name}" is a compound field, all subfields are included.
-=============== ======= ===========
+show_type_counts boolean Whether or not to include total_count_per_object_type for types: Dataverse, Dataset, and Files.
+================ ======= ===========

Basic Search Example
--------------------
@@ -701,7 +702,11 @@ The above example ``metadata_fields=citation:dsDescription&metadata_fields=citat
"published_at": "2021-03-16T08:11:54Z"
}
],
-"count_in_response": 4
+"count_in_response": 4,
+"total_count_per_object_type": {
+    "Datasets": 2,
+    "Dataverses": 2
+}
}
}
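
As a rough illustration of the documented parameter, the sketch below assembles a Search API URL that requests the per-type counts. The class and method names (`SearchUrlSketch`, `buildSearchUrl`) are hypothetical and not part of this PR; only the `q` and `show_type_counts` parameter names and the demo.dataverse.org base URL come from the guide.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SearchUrlSketch {

    // Hypothetical helper (not in the PR): builds a Search API URL from a base
    // URL and a query, optionally asking for total_count_per_object_type.
    static String buildSearchUrl(String baseUrl, String query, boolean showTypeCounts) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("/api/search?q=")
                .append(URLEncoder.encode(query, StandardCharsets.UTF_8));
        if (showTypeCounts) {
            // Parameter name taken from the updated search.rst table.
            url.append("&show_type_counts=true");
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildSearchUrl("https://demo.dataverse.org", "data", true));
    }
}
```

Leaving the flag off produces the same URL as before this change, which matches the integration test's expectation that `total_count_per_object_type` is absent unless explicitly requested.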

10 changes: 10 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/api/Search.java
@@ -73,6 +73,7 @@ public Response search(
@QueryParam("metadata_fields") List<String> metadataFields,
@QueryParam("geo_point") String geoPointRequested,
@QueryParam("geo_radius") String geoRadiusRequested,
+@QueryParam("show_type_counts") boolean showTypeCounts,
@Context HttpServletResponse response
) {

@@ -210,6 +211,15 @@ public Response search(
}

value.add("count_in_response", solrSearchResults.size());
+if (showTypeCounts && !solrQueryResponse.getTypeFacetCategories().isEmpty()) {
+    JsonObjectBuilder objectTypeCounts = Json.createObjectBuilder();
+    for (FacetCategory facetCategory : solrQueryResponse.getTypeFacetCategories()) {
+        for (FacetLabel facetLabel : facetCategory.getFacetLabel()) {
+            objectTypeCounts.add(facetLabel.getName(), facetLabel.getCount());
+        }
+    }
+    value.add("total_count_per_object_type", objectTypeCounts);
+}
/**
* @todo Returning the fq might be useful as a troubleshooting aid
* but we don't want to expose the raw dataverse database ids in
61 changes: 61 additions & 0 deletions src/test/java/edu/harvard/iq/dataverse/api/SearchIT.java
@@ -1347,4 +1347,65 @@ public void testSearchFilesAndUrlImages() {
.body("data.items[0].url", CoreMatchers.containsString("/datafile/"))
.body("data.items[0]", CoreMatchers.not(CoreMatchers.hasItem("image_url")));
}

+@Test
+public void testShowTypeCounts() {
+    //Create 1 user and 1 Dataverse/Collection
+    Response createUser = UtilIT.createRandomUser();
+    String username = UtilIT.getUsernameFromResponse(createUser);
+    String apiToken = UtilIT.getApiTokenFromResponse(createUser);
+    String affiliation = "testAffiliation";
+
+    // test total_count_per_object_type is not included because the results are empty
+    Response searchResp = UtilIT.search(username, apiToken, "&show_type_counts=true");
+    searchResp.then().assertThat()
+            .statusCode(OK.getStatusCode())
+            .body("data.total_count_per_object_type", CoreMatchers.equalTo(null));
+
+    Response createDataverseResponse = UtilIT.createRandomDataverse(apiToken, affiliation);
+    assertEquals(201, createDataverseResponse.getStatusCode());
+    String dataverseAlias = UtilIT.getAliasFromResponse(createDataverseResponse);
+
+    // create 3 Datasets, each with 2 Datafiles
+    for (int i = 0; i < 3; i++) {
+        Response createDatasetResponse = UtilIT.createRandomDatasetViaNativeApi(dataverseAlias, apiToken);
+        createDatasetResponse.then().assertThat()
+                .statusCode(CREATED.getStatusCode());
+        String datasetId = UtilIT.getDatasetIdFromResponse(createDatasetResponse).toString();
+
+        // putting the dataverseAlias in the description of each file so the search q={dataverseAlias} will return dataverse, dataset, and files for this test only
+        String jsonAsString = "{\"description\":\"" + dataverseAlias + "\",\"directoryLabel\":\"data/subdir1\",\"categories\":[\"Data\"], \"restrict\":\"false\" }";
+
+        String pathToFile = "src/main/webapp/resources/images/dataverseproject.png";
+        Response uploadImage = UtilIT.uploadFileViaNative(datasetId, pathToFile, jsonAsString, apiToken);
+        uploadImage.then().assertThat()
+                .statusCode(200);
+        pathToFile = "src/main/webapp/resources/js/mydata.js";
+        Response uploadFile = UtilIT.uploadFileViaNative(datasetId, pathToFile, jsonAsString, apiToken);
+        uploadFile.then().assertThat()
+                .statusCode(200);
+
+        // This call forces a wait for dataset indexing to finish and gives time for file uploads to complete
+        UtilIT.search("id:dataset_" + datasetId, apiToken);
+    }
+
+    // Test Search without show_type_counts
+    searchResp = UtilIT.search(dataverseAlias, apiToken);
+    searchResp.then().assertThat()
+            .statusCode(OK.getStatusCode())
+            .body("data.total_count_per_object_type", CoreMatchers.equalTo(null));
+    // Test Search with show_type_counts = FALSE
+    searchResp = UtilIT.search(dataverseAlias, apiToken, "&show_type_counts=false");
+    searchResp.then().assertThat()
+            .statusCode(OK.getStatusCode())
+            .body("data.total_count_per_object_type", CoreMatchers.equalTo(null));
+    // Test Search with show_type_counts = TRUE
+    searchResp = UtilIT.search(dataverseAlias, apiToken, "&show_type_counts=true");
+    searchResp.prettyPrint();
+    searchResp.then().assertThat()
+            .statusCode(OK.getStatusCode())
+            .body("data.total_count_per_object_type.Dataverses", CoreMatchers.is(1))
+            .body("data.total_count_per_object_type.Datasets", CoreMatchers.is(3))
+            .body("data.total_count_per_object_type.Files", CoreMatchers.is(6));
+}
}
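
The aggregation added to Search.java can be read in isolation roughly as follows. This is a minimal sketch under stated assumptions, not the project's code: `FacetCategory` and `FacetLabel` here are hypothetical stand-in records for the Dataverse classes of the same names, `typeCounts` is an invented helper, and a `LinkedHashMap` stands in for the `JsonObjectBuilder` used in the PR.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class TypeCountsSketch {

    // Hypothetical stand-ins for the Dataverse facet classes referenced in the PR.
    record FacetLabel(String name, long count) {}
    record FacetCategory(List<FacetLabel> facetLabels) {}

    // Mirrors the guard in the PR: nothing is produced when the flag is false
    // or when there are no type facet categories (for example, an empty result set).
    static Optional<Map<String, Long>> typeCounts(boolean showTypeCounts,
                                                  List<FacetCategory> categories) {
        if (!showTypeCounts || categories.isEmpty()) {
            return Optional.empty();
        }
        Map<String, Long> counts = new LinkedHashMap<>();
        for (FacetCategory category : categories) {
            for (FacetLabel label : category.facetLabels()) {
                // One entry per facet label, keyed by the label name.
                counts.put(label.name(), label.count());
            }
        }
        return Optional.of(counts);
    }

    public static void main(String[] args) {
        List<FacetCategory> categories = List.of(new FacetCategory(List.of(
                new FacetLabel("Dataverses", 1),
                new FacetLabel("Datasets", 3),
                new FacetLabel("Files", 6))));
        System.out.println(typeCounts(true, categories));
        System.out.println(typeCounts(false, categories));
    }
}
```

The sample values match the integration test above: one collection, three datasets, and two files per dataset for six files in total.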