-
Notifications
You must be signed in to change notification settings - Fork 445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds new accumulo command #5073
base: 2.1
Are you sure you want to change the base?
Conversation
Adds new `accumulo check-server-config-no-inst` command which performs the checks that `accumulo check-server-config` does without the checks that require a running Accumulo instance. Useful for (at least mostly) validating the `accumulo.properties` file without a running instance.
@kevinrr888 - wondering if we could just run the no-instance checks if the error is thrown when an instance doesn't exist so that we can re-use the command? Thoughts? |
I considered that, but thought it might hide some issues. For example, if a user expects that they have a running instance and run |
I was thinking a stern warning could printed to the screen that no instance is running so we are doing a reduced set of checks. I assume someone is going to run this interactively and not via a script. If we don't want to make that assumption, we could return a non-zero code. |
Yeah, I think a warning like that would be a way to replace the extra command. However, another thing I'm thinking of is it might be tricky to identify what a no-instance exception is. There is a certain exception that currently occurs, but that exception may change. |
We could just raise our own exception, NoInstanceException, for example. |
Deleted CheckServerConfigNoInstance and moved functionality to CheckServerConfig
I removed the new check and now check if an instance exists before the full set of checks runs in I wanted to avoid just surrounding the existing code:
with a *to the best of my knowledge |
One issue with this current impl is that the stack trace will still show an exception (this time a
The subset of the checks have still occurred and passed (noted in the output), but the exception unfortunately makes the info less visible and it might be better if the exception was not seen here. |
Regarding the exception being printed, that might be due to the logging configuration for the client (log4j2.properties). If that is set to direct output to a file, then you likely would not see that. For that reason, any output that we have from the command may want to use System.out or System.err |
Gotcha. Changed to print the warning. That logged error from my running of it was my only concern with this, but if it's fine to you then I have no problems with it. |
try { | ||
VolumeManager.getInstanceIDFromHdfs(instanceIdPath, hadoopConfig); | ||
} catch (Exception e) { | ||
System.out.println("WARNING : Performed only a subset of checks (which passed). There is a " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ctubbsii - Do you have thoughts on this and the discussion around it on the conversation tab?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really sure. It seems to me that the original intent of the check, like most checks, are to check this instance. I'm not sure that this command makes any sense at all to run without an instance, so I think just having an error is fine in those circumstances.
It seems like the use case that #5034 was trying to report was different than this check was written for. That use case seems to be "I want to validate this config file, independently of any Accumulo instance". For that case, I think a much simpler solution can be done... just provide a config file as an optional argument, do SiteConfiguration.fromFile(new File(path)).build()
using the specified file to validate it, rather than using SiteConfiguration.auto()
, and then skip any instance-specific checks after that. After all, that use case is just asking for validation of the contents of a config file, not validation of any particular instance's configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid messing with the original functionality of the CheckServerConfig
check (as it seems like it is working as intended), created new check to validate the Accumulo configuration file in 1255f90. Allows users to optionally provide the config file, as you mentioned. Also see comment.
Edit: removed the ability to accept a file path as an arg in 973aefa. Appears unacceptable in our build constraints:
[ERROR] High: This API (java/io/File.<init>(Ljava/lang/String;)V) reads a file whose location might be specified by user input [org.apache.accumulo.server.conf.CheckAccumuloConfig] At CheckAccumuloConfig.java:[line 44] PATH_TRAVERSAL_IN
Checks the accumulo config at the given file path if one is provided. Otherwise, attempts to find and check the config file. Useful for checking the accumulo config without a running instance.
try { | ||
VolumeManagerImpl.get(siteConfig, hadoopConfig); | ||
} catch (IOException e) { | ||
throw new IllegalStateException(e); | ||
} | ||
new ServerDirs(siteConfig, hadoopConfig); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also included these checks which were the only checks I found (other than SiteConfiguration.<...>.build()
) in the original CheckServerConfig
code which would have done some validation that did not require an instance. These check properties like instance.volumes
and instance.volumes.replacements
(maybe more). Seemed fine to include as they just perform extra checks on instance
related properties without the instance running or needing to be started.
Maybe it would be better if these weren't included or maybe it would be better if more checks were included (would need info on what else to include). I'm not sure. Some input on this would be appreciated.
Appears unnacceptable in our build constraints
closes #5034
I did a deep dive into the code executed by
accumulo check-server-config
to figure out what is being checked to see if an instance can be avoided or what could be checked without an instance. Here is root code for reference:accumulo/server/base/src/main/java/org/apache/accumulo/server/conf/CheckServerConfig.java
Lines 30 to 34 in ed4161e
Here is a summary of my findings:
SiteConfiguration.auto()
does validation: checks that the deprecatedaccumulo-site.xml
file doesn't exist, loads properties from the config file (accumulo.properties
), checks that both deprecated and the replacement properties aren't both declared in the file, replaces deprecated properties with their non-deprecated versions, and returns a new SiteConfiguration object (the SiteConfiguration object constructor callsConfigCheckUtil.validate(config)
to validate the config passed in)context.getConfiguration()
does validation. However, as far as I can tell, the only validation done from this call requires an instance (the only validation seems to stem fromZooBasedConfiguration
/PropSnapshot
/PropStore
/PropStoreKey
/etc.). Validates at least theinstance.secret
property (maybe more). Since this is related to an instance, seems fine that it wouldn't be validated here anyways.new ServerContext(SiteConfiguration.auto())
(=new ServerContext(new ServerInfo(SiteConfiguration.auto()))
) does validation. This creates anew ServerInfo(siteConfig)
passing that to theServerContext
constructor. There is no validation done in theServerContext
constructor: most fields are memoized suppliers (the only one we access isServerConfigurationFactory serverConfFactory
which is accessed throughcontext.getConfiguration()
and we went into that code above). The ones that aren't memoized suppliers don't do any validation. But there is validation done from constructingServerInfo
. We will take what we can (does some form of validation and is not reliant on an instance) and use that in our new check.With this info, I wrote a new check
CheckServerConfigNoInstance
which, to the best of my knowledge, is all the validation that the currentCheckServerConfig
would be able to do without a running instance. So, this PR adds a newaccumulo check-server-config-no-inst
(in addition to the existingaccumulo check-server-config
) which performs the checks thataccumulo check-server-config
does without the checks that require a running Accumulo instance. Useful for (at least mostly) validating theaccumulo.properties
file without an instance.I'm not sure which branch this should target. Targeting 2.1 for now since the change could be made as early as 2.1 and it may be a desireable command here, but not sure if we would want to add a new command in the 2.1 line.
Any input/suggestions on this PR would be appreciated.