Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Composable index template mapping can prevent creating an index if mappings conflict #16340

Closed
dbwiddis opened this issue Oct 15, 2024 · 9 comments · Fixed by #16418
Closed
Assignees
Labels
bug Something isn't working Indexing Indexing, Bulk Indexing and anything related to indexing v2.19.0 Issues and PRs related to version 2.19.0 v3.0.0 Issues and PRs related to version 3.0.0

Comments

@dbwiddis
Copy link
Member

dbwiddis commented Oct 15, 2024

Describe the bug

In the index templates documentation for Composable index templates, it states:

Settings and mappings that you specify directly in the create index request override any settings or mappings specified in an index template and its component templates.

This documentation is not accurate.

Specifically, overriding is not possible if the conflicting type can't be merged (e.g., between object and nested, field and object, etc.). Overriding only occurs on "leaf" components. (See ab65a57)

Worse, this mapping conflict prevents index creation, including system indices.

A user could accidentally add a wrong mapping and potentially prevent a node from even starting up: opensearch-project/observability#1872

Related component

Indexing

To Reproduce

  1. Create a component template with an object or nested mapping for a field ("error" in this case):
PUT _component_template/error_mapping_template
{
  "template": {
    "mappings": {
      "properties": {
        "error" : {
          "type": "nested",
          "properties": {
            "message": {
              "type": "text"
            },
            "status": {
              "type": "integer"
            }
          }
        }
      }
    }
  }
}
  1. Create an index template using that composable template mapping:
PUT _index_template/match_all_template
{
  "index_patterns": [
    "*"
  ],
  "template": {
    "settings": {}
  },
  "priority": 10,
  "composed_of": [
    "error_mapping_template"
  ],
  "version": 1
}
  1. Attempt to create an index with a conflicting field mapping (including a system index):
PUT /.plugins-foo-system-index
{
  "mappings": {
    "properties": {
      "error": {
        "type": "text"
      }
    }
  }
}
  1. Receive the below error, and observe that the index did not successfully create.
{
    "error": {
        "root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "can't merge a non object mapping [error] with an object mapping"
            }
        ],
        "type": "illegal_argument_exception",
        "reason": "can't merge a non object mapping [error] with an object mapping"
    },
    "status": 400
}

Expected behavior

The mapping specified in the create index API call overrides the template mapping, and the index is actually created.

Additional Details

Plugins

N/A. Reproduced on a local cluster using HEAD of 2.x branch, using ./gradlew clean run.

However, this is a significant issue for any plugin that uses a system index. It's a critical issue for plugins that prevent a node from starting up if the system index is not created.

Host/Environment (please complete the following information):

Reproducable locally on macOS; occurred on a production cluster on AOS, OpenSearch version 2.15

Additional context

This is significant in that:

  1. Many plugins use system indices
  2. These index templates are in control of the user and yet conflict with system indices
  3. There's no easy way for a plugin, when creating a system index, to avoid this conflict or specify "don't use templates at all", essentially allowing a user to break several plugins without really even knowing it.

In this case, to work around the bug I had to create a (temporary) higher priority matching template with an empty object in the mapping field to prevent the incompatible mapping.

Stack trace leading to the error:

java.lang.IllegalArgumentException: can't merge a non object mapping [error] with an object mapping
        at org.opensearch.index.mapper.ObjectMapper.merge(ObjectMapper.java:767) ~[opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.index.mapper.ObjectMapper.doMerge(ObjectMapper.java:798) ~[opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.index.mapper.RootObjectMapper.doMerge(RootObjectMapper.java:364) ~[opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.index.mapper.ObjectMapper.merge(ObjectMapper.java:771) ~[opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.index.mapper.RootObjectMapper.merge(RootObjectMapper.java:359) ~[opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.index.mapper.Mapping.merge(Mapping.java:130) ~[opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.index.mapper.DocumentMapper.merge(DocumentMapper.java:310) ~[opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.index.mapper.MapperService.internalMerge(MapperService.java:521) ~[opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.index.mapper.MapperService.internalMerge(MapperService.java:483) ~[opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.index.mapper.MapperService.merge(MapperService.java:447) ~[opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.cluster.metadata.MetadataCreateIndexService.updateIndexMappingsAndBuildSortOrder(MetadataCreateIndexService.java:1441) ~[opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.cluster.metadata.MetadataCreateIndexService.lambda$applyCreateIndexWithTemporaryService$3(MetadataCreateIndexService.java:523) [opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.indices.IndicesService.withTempIndexService(IndicesService.java:982) [opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexWithTemporaryService(MetadataCreateIndexService.java:514) [opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequestWithV2Template(MetadataCreateIndexService.java:783) [opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequest(MetadataCreateIndexService.java:457) [opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequest(MetadataCreateIndexService.java:483) [opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.cluster.metadata.MetadataCreateIndexService$1.execute(MetadataCreateIndexService.java:389) [opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
        at org.opensearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:67) [opensearch-2.18.0-SNAPSHOT.jar:2.18.0-SNAPSHOT]
@dbwiddis dbwiddis added bug Something isn't working untriaged labels Oct 15, 2024
@github-actions github-actions bot added the Indexing Indexing, Bulk Indexing and anything related to indexing label Oct 15, 2024
@dbwiddis dbwiddis changed the title [BUG] Composable index template mapping takes priority over Create Index mapping [BUG] Composable index template mapping can prevent creating an index Oct 15, 2024
@dbwiddis dbwiddis changed the title [BUG] Composable index template mapping can prevent creating an index [BUG] Composable index template mapping can prevent creating an index if mappings conflict Oct 15, 2024
@sandeshkr419
Copy link
Contributor

To be honest, I think the existing behavior makes sense. You apply a template as a rule to be followed with indices to be matched, and it follows as directed to.

Now to get to the point how to obtain the desired behavior. I think there is a work-around which makes sense. (There may be better ways to do it which I'm not aware).

So basically let's say you want to create a different rule for your system indices. So,

  1. Create a new component template
PUT another _component_template: error_mapping_template2
{
  "template": {
    "mappings": {
      "properties": {
        "error" : {
              "type": "text"
            },
            "status": {
              "type": "integer"
            }
          }
        }
      }
    }
  }
}
  1. Create another index-template with a higher priority for a sub-set of indices:
PUT _index_template/match_all_template
{
  "index_patterns": [
    "system*"
  ],
  "template": {
    "settings": {}
  },
  "priority": 20,
  "composed_of": [
    "error_mapping_template2"
  ],
  "version": 1
}
  1. Now your system* patterns will obey your higher priority template.
PUT /system-index
{
  "mappings": {
    "properties": {
      "error": {
        "type": "text"
      }
    }
  }
}

(this succeeds)

@dbwiddis
Copy link
Member Author

dbwiddis commented Oct 16, 2024

To be honest, I think the existing behavior makes sense. You apply a template as a rule to be followed with indices to be matched, and it follows as directed to.

It's an overridable rule. You can (and the documentation says as much) override any template field mapping; the templates are just defaults.

But there are some mappings that you can't override, and that's not documented and not obvious to users, and the error message when it doesn't work is unhelpful. It took 2 weeks and 5 devs finding this error "in the wild" to dig deep enough to figure out this undocumented behavior.

I think there is a work-around which makes sense.

Yes, I described the workaround with a higher-priority template. You don't even need the complexity of a new composable template, just one with empty mappings:

PUT _index_template/temporary_high_priority
{
  "index_patterns": [
    ".*"
  ],
  "template": {
    "mappings": {}
  },
  "priority": 1000000
}

This works fine for "fixing" a plugin, but in order to use this workaround for a plugin that wants to, on initialization, consistently and reliably create a system index with known mappings, you would have to:

  • query the existing templates
  • iterate the templates and use some fancy regex or glob match to identify matching ones, or just assume all match and find the max priority. Also save a set of their names.
  • increment that max priority
  • create a temporary template making sure the name doesn't conflict, with the highest priority
  • create your system index
  • delete your temporary template.

And that assumes the node is started up. Some plugins create indices on startup and could kill the node before it even gets that far: opensearch-project/observability#1872

Compare that to alternative solutions like:

  • have a "use_templates=false" parameter for create index
  • have a "template" : "none" field in the create index body
  • forbid user match-all ([*]) templates from matching against system indices
  • provide a programmatic way for plugins creating indices to prevent templates from being used, even if it's not exposed in the REST API

If we're not going to allow full override behavior I'd prefer one of the above solutions, to give plugins (or ideally any index creator) the power to completely override a template. Requiring plugins to go through the "check all the templates to make sure you're not going to be stepped on" shenanigans is not scalable.

@zane-neo
Copy link
Contributor

System index shouldn't follow the index template created by user for their business purpose. I vote for filtering out the system index creation from index template.

@dbwiddis
Copy link
Member Author

System index shouldn't follow the index template created by user for their business purpose. I vote for filtering out the system index creation from index template.

@zane-neo Thanks for this suggestion. I generally agree. Follow-up question: is there a specific definition of "system index" that we can use for this filtering? Is it as simple as a leading dot, or does it need to be "registered"? (CC: @cwperks for comment here)

@cwperks
Copy link
Member

cwperks commented Oct 21, 2024

Is it as simple as a leading dot, or does it need to be "registered"? (CC: @cwperks for comment here)

@dbwiddis Conventionally they all begin with dot, but formally they must be registered with SystemIndexPlugin.getSystemIndexDescriptors. The registry is formed here.

@dbwiddis
Copy link
Member Author

Now to just figure out where to fetch that object when matching indices with a template...

@cwperks
Copy link
Member

cwperks commented Oct 21, 2024

And one clarification for my comment above. The system index registration supports "patterns". When a concrete index is made that matches the pattern, it is considered a system index.

For instance if the alerting plugin registers .opendistro-alerting*, then as soon as .opendistro-alerting-config, .opendistro-alerting-detectors, etc. are created they are system indices since they match one of the registered patterns.

System indices may also be auto created. If you ingest a document and it doesn't exist, it gets automatically created. See test for details.

@dbwiddis
Copy link
Member Author

dbwiddis commented Oct 21, 2024

Looks like systemIndices is already present in the MetadataCreateIndexService:

We also already give special treatment to hidden indices here:

// Hidden indices apply templates slightly differently (ignoring wildcard '*'
// templates), so we need to check to see if the request is creating a hidden index
// prior to resolving which templates it matches
final Boolean isHiddenFromRequest = IndexMetadata.INDEX_HIDDEN_SETTING.exists(request.settings())
? IndexMetadata.INDEX_HIDDEN_SETTING.get(request.settings())
: null;
// The backing index may have a different name or prefix than the data stream name.
final String name = request.dataStreamName() != null ? request.dataStreamName() : request.index();
// Check to see if a v2 template matched
final String v2Template = MetadataIndexTemplateService.findV2Template(
currentState.metadata(),
name,
isHiddenFromRequest == null ? false : isHiddenFromRequest
);

So we probably just need to alter the logic here and below to set null for v2 template and an empty list for v1 template.

@pyek-bot
Copy link
Contributor

pyek-bot commented Oct 22, 2024

Can you assign this issue to me please?

Created a PR for this fix: #16418

Please take a look and advice for backporting

@dbwiddis @cwperks @zane-neo

@reta reta added v3.0.0 Issues and PRs related to version 3.0.0 v2.19.0 Issues and PRs related to version 2.19.0 labels Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Indexing Indexing, Bulk Indexing and anything related to indexing v2.19.0 Issues and PRs related to version 2.19.0 v3.0.0 Issues and PRs related to version 3.0.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants