-
Notifications
You must be signed in to change notification settings - Fork 12
CDAP endpoint no longer available if Data Fusion instance updated. #102
Comments
It seems the namespace was deleted, which should trigger all the resources within the namespace to be deleted. There are existence checks on all the resources e.g.
but it might not be checking if the namespace still exists. To do that we need to call
I don't work on this project actively anymore and don't have the bandwidth to support this, but I am happy to review a PR if someone wants to add this. Thanks! |
Hello @umairidris In fact, this issue in not resulting of a namespace deletion, this happens to all CDAP resources when we make a change forcing the replacement of the data fusion instance. # CDAP provider needed for each Data Fusion instance.
provider "cdap" {
host = "${module.datafusion_instance.wait_healthy_service_endpoint}/api/"
token = data.google_client_config.current.access_token
} Because the provider is configured by the output on the data fusion instance resources (in this case a script that check that the instance is really created and ready), when terraform need to recreate the resource, it reinitializes the value of this parameter Do you know if there is a way to configure terraform in order to say : If you have to recreate the instance just ignore all CDAP resources because they will be recreated. |
Thanks for the additional context. If we remove the wait healthy module, does it work correctly? i.e. a CDF recreation correctly recreates all cdap resources? If so, then it seems we just need to work with the wait healthy module. Maybe we need to add some triggers in the null resource to correctly recreate when cdf instance is recreated? Another best practice could be to separate cdf instance and cdap resources into separate states and apply them separately. If that doesn't work, it may require fixes in the provider existence checks that I mentioned above. |
@umairidris do you know if anyone is actively developing on this provider anymore? |
Hi @jrstarke, no this provider is not currently being developed. If you would like to send bug fixes or feature improvements then I am happy to review and merge them on a voluntary basis. If you would like this provider to be better maintained I would suggest sending feedback to the data fusion team: https://cloud.google.com/data-fusion/docs/support/getting-support. |
Hello, @umairidris , to get back to this issue that still persists for me, the reason why we added the Perhaps this issue we encountered is no longer relevant ? I will check as you requested if it works when we remove the Assuming our issue comes from our implementation of the |
Hi @FVCCarneiro, My apologies, this completely slipped past me! If you (or someone) is still dealing with this, the first step would be to see if it is an issue with the wait_healthy module. After the initial creation of the null_resource in the wait_healthy module, the CDF instance URL is saved into the TF state and it does not call the script again to check if the instance is still OK. If they are in the same deployment, a recreation of the instance should cause any dependent resources to be recreated. But if they are not in the same deployment we would need to add a trigger to recheck the state of the instance in wait_healthy module. The null resource https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource has some fields we could set for triggers. Longer term we should not rely on the wait_healthy module. Waiting for health should be done by the CDF instance itself (e.g. we should be able to configure it as an option on the instance). Please file a ticket with the CDF team to add support for this. In the provider, we should ensure we have reliable existence checks that can detect if the cdap resource no longer exists and thus will cause a recreation. Hope this helps! |
So consider the scenario where I have a Data Fusion in version 6.4.1 and I wish to re-deploy it as 6.5.0 version via Terraform. In Terraform, this implies simply changing the attribute
version
attribute. After making this change, when performing aterraform plan
the following error is obtained:| ERROR Get "/v3/namespaces" : unsupported protocol scheme""
To solve this, I am forced to remove all cdap ressources which produce the error using the
terraform state rm
command. Only after this can Terraform update the instance and re-create all the CDAP resources that have to be re-created due to the fact that the Data Fusion instance needs to be destroyed and recreated.I believe that this error comes from the fact that there is a dependency between a Data Fusion instance and all its CDAP resources that is not considered by the CDAP provider. One should never have to resort to editing the tfstate file to solve these types of issues. Is there a way to fix this?
The text was updated successfully, but these errors were encountered: