Releases: awslabs/aws-serverless-data-lake-framework
Serverless Data Lake Framework 2.0.0-rc.4
Work is ongoing on a new major version of the Serverless Data Lake Framework. This is a pre-release, not ready for production workloads.
What's Changed
Read Serverless Data Lake Framework 2.0.0-beta.0 to know more about what's new in SDLF 2.0.0!
- refactor: remove usage of boto3 resource by @cnfait in #228
- move the -{env} suffix from layers and transforms too by @cnfait in #229
- add logs permissions to glue crawler role by @cnfait in #230
- remove everything that we do not officially support at launch by @cnfait in 1afb677
- remove role name for legislators glue job role, update path by @cnfait in 13a36d2
- sdlf-monitoring: remove ELK stack by @cnfait in 0970b53
Full Changelog: 2.0.0-rc.3...2.0.0-rc.4
Thanks
We thank all the contributors/users for their work on this release.
Serverless Data Lake Framework 2.0.0-rc.3
Work is ongoing on a new major version of the Serverless Data Lake Framework. This is a pre-release, not ready for production workloads.
What's Changed
Read Serverless Data Lake Framework 2.0.0-beta.0 to know more about what's new in SDLF 2.0.0!
- access control using lakeformation only by @cnfait in #219
- add job in static-checking github action: black by @gwenika in #220
- add job in static-checking github action: ruff by @gwenika in #221
- add job in static-checking github action: shellcheck by @gwenika in #222
- fix all static checking findings (black, ruff, shellcheck, cfn-lint) by @cnfait in #223
- Fix all cfn_nag findings by @cnfait in #225
- remove the -{env} suffix from files in team repository by @cnfait in #226
- add workshop examples to sdlf-utils by @cnfait in #227
Full Changelog: 2.0.0-rc.2...2.0.0-rc.3
Thanks
We thank all the contributors/users for their work on this release.
Serverless Data Lake Framework 2.0.0-rc.2
Work is ongoing on a new major version of the Serverless Data Lake Framework. This is a pre-release, not ready for production workloads.
What's Changed
Read Serverless Data Lake Framework 2.0.0-beta.0 to know more about what's new in SDLF 2.0.0!
- get size and last modified metadata from s3 in a single request by @cnfait in #215
- Crossaccount roles simplification by @cnfait in #216
- add batch capability to updating object metadata in datalakeLibrary by @cnfait in #217
- fix single account workshop/demo setup by @cnfait in #218
Full Changelog: 2.0.0-rc.1...2.0.0-rc.2
Serverless Data Lake Framework 2.0.0-rc.1
Work is ongoing on a new major version of the Serverless Data Lake Framework. This is a pre-release, not ready for production workloads.
What's Changed
Read Serverless Data Lake Framework 2.0.0-beta.0 to know more about what's new in SDLF 2.0.0!
- create an athena workgroup per team by @gwenika in #208
- add optional glue job deployer feature by @cnfait in #209
- Optional VPC support for SDLF by @cnfait in #211
- trigger rMain pipelines on sdlf-cicd repository change by @cnfait in 4fa2a4a
- sdlf-team: log groups for datasets/pipelines-dynamodb lambda functions by @cnfait in #212
- restore min/max_items_process feature at the pipeline level by @cnfait in #213
- provide vpc connection to glue jobs in glue-job-deployer, add disable-proxy-v2 to default arguments by @cnfait in #214
Full Changelog: 2.0.0-rc.0...2.0.0-rc.1
Thanks
We thank all the contributors/users for their work on this release, in particular @gwenika.
Serverless Data Lake Framework 2.0.0-rc.0
Work is ongoing on a new major version of the Serverless Data Lake Framework. This is a pre-release, not ready for production workloads.
What's Changed
Read Serverless Data Lake Framework 2.0.0-beta.0 to know more about what's new in SDLF 2.0.0!
- automatically create codepipeline infrastructure for new stages by @cnfait in #190
- StageA state machine update by @cnfait in #191
- Python 3.11 as default for CodeBuild runtimes. Also update Codebuild image to latest AmazonLinux2 by @Mreddy19 in #192
- Python 3.11 as default for Lambda functions by @Mreddy19 in #193
- StageB state machine update by @cnfait in #194
- Enforce the use of aws-cli >= 2.13.0 by @Mreddy19 in #196
- fix sdlf-stageB glue arguments from dynamodb by @cnfait in #198
- Configurable lambda stageA is now configurable through dynamodb by @cnfait in #199
- StageB: configurable glue job name through dynamodb by @cnfait in #200
- deploy datalakelibrary using cloudformation by @cnfait in #201
- use 'main' as the default branch name instead of 'master' by @cnfait in #202
- Use KMS for base encryption of S3 buckets by @cnfait in #203
- Build CloudFormation modules as part of sdlf-main/sdlf-main-* repository pipelines by @cnfait in #204
- update GlueVersion from 2.0 to 4.0 by @cnfait in #205
Full Changelog: 2.0.0-beta.4...2.0.0-rc.0
Thanks
We thank all the contributors/users for their work on this release, in particular @Mreddy19.
Serverless Data Lake Framework 2.0.0-beta.4
Work is ongoing on a new major version of the Serverless Data Lake Framework. This is a pre-release, not ready for production workloads.
What's Changed
Read Serverless Data Lake Framework 2.0.0-beta.0 to know more about what's new in SDLF 2.0.0!
- Team-specific IAM role for pipelines and datasets creation by @cnfait in #183
- sdlf-pipeline cloudformation short form syntax by @cnfait in #184
- Codebuild awscli version by @mousamm in #185
- sdlf-cicd refactoring by @cnfait in #186
- Fix BuildLambdaLayers (previously known as sdlf-pipLibrary) by @cnfait in #187
- use sdlf-main-{domain}-{team} naming scheme instead of sdlf-{domain}-{team}-main by @cnfait in #188
- restore StageA+StageB codepipeline by @cnfait in #189
Full Changelog: 2.0.0-beta.3...2.0.0-beta.4
Thanks
We thank all the contributors/users for their work on this release, in particular @mousamm.
Serverless Data Lake Framework 2.0.0-beta.3
Work is ongoing on a new major version of the Serverless Data Lake Framework. This is a pre-release, not ready for production workloads.
What's Changed
Read Serverless Data Lake Framework 2.0.0-beta.0 to know more about what's new in SDLF 2.0.0!
- sdlf-foundations: use s3 eventbridge support by @cnfait in #180
- Fix cfn_nag issues in sdlf-foundations by @cnfait in #181
- store the aws-sam-cli zip in s3 when setting up sdlf by @cnfait in #182
- sdlf-monitoring: migrate from Elasticsearch 7.10 to OpenSearch 2.5 by @tomaszwrzonski in #167
Full Changelog: 2.0.0-beta.2...2.0.0-beta.3
Thanks
We thank all the contributors/users for their work on this release, in particular @tomaszwrzonski.
Serverless Data Lake Framework 2.0.0-beta.2
Work is ongoing on a new major version of the Serverless Data Lake Framework. This is a pre-release, not ready for production workloads.
What's Changed
Read Serverless Data Lake Framework 2.0.0-beta.0 to know more about what's new in SDLF 2.0.0!
- Explicit CloudFormation removal policy by @cnfait in #163
- Fix sdlf-monitoring template issues by @cnfait in #164
- allow tagging infrastructure with stack tags by @cnfait in #169
- IAM permissions cleanup by @cnfait in #170
- migration to EventBridge Scheduler by @cnfait in #171
- CloudFormationManagedUploadInfrastructure permissions update by @cnfait in #172
- move the states execution role from sdlf-team to sdlf-stage* by @cnfait in #173
- remove old test infrastructure by @cnfait in #174
- sdlf-team: fix cfn_nag issues by @cnfait in #175
- replace jaidisido with cnfait in github issue templates by @cnfait in #176
- disable cfn_nag W76 SPCM too high by @cnfait in #177
- sdlf-stage*: fix cfn_nag issues by @cnfait in #178
Full Changelog: 2.0.0-beta.1...2.0.0-beta.2
Serverless Data Lake Framework 2.0.0-beta.1
Work is ongoing on a new major version of the Serverless Data Lake Framework. This is a pre-release, not ready for production workloads.
What's Changed
Read Serverless Data Lake Framework 2.0.0-beta.0 to know more about what's new in SDLF 2.0.0!
- CloudFormationManagedUploadInfrastructure now requires s3:PutBucketPu… by @tomaszwrzonski in #143
- Build CloudFormation module version only when needed by @cnfait in #160
- Improve our cfn-lint posture by @cnfait in #162
New Contributors
- @tomaszwrzonski made their first contribution in #143
Full Changelog: 2.0.0-beta.0...2.0.0-beta.1
Serverless Data Lake Framework 2.0.0-beta.0
Work is ongoing on a new major version of the Serverless Data Lake Framework. This is a pre-release, not ready for production workloads.
What’s New
- SDLF components are now CloudFormation modules
  - there is one module per component: foundations, team, pipeline, stageA, stageB, dataset.
  - datalakeLibrary and pipLibrary are used to build Lambda layers, they’re not CloudFormation modules.
  - deploy.sh takes care of deploying the CICD infrastructure used to build these modules, and register them in the private CloudFormation registry of each account. Modules are updated whenever there is a change to their source repository.
- SDLF CICD pipelines now live in the Shared DevOps account
  - CloudFormation stacks are created in child accounts through crossaccount IAM roles.
  - SDLF can deploy an arbitrary number of child accounts driven from a single devops account.
  - pDomain (which defaults to datalake) can be provided when deploying foundations.
  - each domain can have the usual three environments (dev, test, prod).
- Deploying foundations and teams is now done from a new repository called sdlf-main.
  - this repository is created during the initial setup with deploy.sh.
  - foundations deployment happens in foundations-{domain}-{env}.yaml and teams in teams-{domain}-{env}.yaml.
  - sdlf-main works the same way everything works in SDLF - master, test and dev branches are expected.
  - it is easier to know which teams have been created, and to remove them as they don’t share the same set of parameters in parameters-{env}.json.
- Deploying pipelines and datasets is now done from a new repository called sdlf-{domain}-{team name}-main.
  - this repository is created when a new team is created.
  - pipelines deployment happens in pipelines-{env}.yaml and datasets in datasets-{env}.yaml.
  - sdlf-{team name}-main works the same way everything works in SDLF - master, test and dev branches are expected.
  - it is easier to know which pipelines and teams have been created, and to remove them as they don’t share the same set of parameters in parameters-{env}.json.
- Mappings between datasets and transforms in stageB are now done directly when defining a dataset.
  - this mapping used to be done by a CodeBuild project and a script in sdlf-datalakeLibrary. They are no longer needed and have been removed.
  - it is now defined through the pPipelineDetails parameter when defining a dataset in sdlf-dataset. This parameter goes even further and can be used to store more information that stages can use. These details are stored in the Datasets DynamoDB table (as was already the case in SDLFv1).
- Stages in a pipeline are now driven by EventBridge rules exclusively.
  - the rule can be an event pattern or a schedule (cron expression).
  - stageA is no longer sending messages to a queue for stageB to process. StageB is configured with an event pattern to listen for stageA runs (pEventPattern in the example), and then process these events on a schedule (pSchedule).
  - it is easier now to have pipelines with a single stage, pipelines with dependent stages and overall more complex pipelines than in SDLFv1, as long as there is an event pattern to listen for.
- New optional component: sdlf-monitoring, with CloudTrail, ELK and SNS.
  - in SDLFv1 Cloudtrail is optional but enabled by default. Here it is optional and not enabled as long as sdlf-monitoring is not deployed.
- New optional stage: sdlf-stage-dataquality
  - deequ is now entirely optional. While it wasn’t enabled by default in SDLFv1, dedicated infrastructure was still created while deploying sdlf-foundations. This is no longer the case.
  - sdlf-stage-dataquality can now be used as an example on how to add a third stage to the default stageA and stageB pipeline.
- Outside the initial deploy.sh, there are no more shell scripts.
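
To make the EventBridge-driven stages described in the list above more concrete, here is a minimal sketch of what a stageB event pattern and schedule could look like. The pattern below (a Step Functions execution status change for a stageA state machine), the ARN, the cron value and the `matches` helper are assumptions made for illustration; the actual values passed to pEventPattern and pSchedule in SDLF may differ. The matcher only handles exact-value matching, a small subset of what EventBridge supports.

```python
# Hypothetical values for illustration; not taken from the SDLF templates.
STAGE_A_STATE_MACHINE_ARN = (
    "arn:aws:states:eu-west-1:111111111111:stateMachine:sdlf-example-stage-a"
)

# What a pEventPattern value could look like: react to successful stageA runs.
event_pattern = {
    "source": ["aws.states"],
    "detail-type": ["Step Functions Execution Status Change"],
    "detail": {
        "status": ["SUCCEEDED"],
        "stateMachineArn": [STAGE_A_STATE_MACHINE_ARN],
    },
}

# What a pSchedule value could look like: process collected events every 5 minutes.
schedule = "cron(*/5 * * * ? *)"


def matches(pattern: dict, event: dict) -> bool:
    """Simplified exact-value matching: every key in the pattern must be
    present in the event, and the event value must be one of the listed values.
    Nested dictionaries are matched recursively."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# A trimmed-down example event, shaped like a Step Functions status change.
stage_a_succeeded = {
    "source": "aws.states",
    "detail-type": "Step Functions Execution Status Change",
    "detail": {
        "status": "SUCCEEDED",
        "stateMachineArn": STAGE_A_STATE_MACHINE_ARN,
    },
}
stage_a_failed = {
    **stage_a_succeeded,
    "detail": {**stage_a_succeeded["detail"], "status": "FAILED"},
}
```

With this sketch, `matches(event_pattern, stage_a_succeeded)` is true and `matches(event_pattern, stage_a_failed)` is false, which mirrors the idea that stageB only picks up the stageA runs its pattern listens for.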
Full Changelog: 1.5.2...2.0.0-beta.0
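
The repository and file naming conventions introduced in 2.0.0-beta.0 can be summarised with a short sketch. The function names below are hypothetical helpers written for this illustration and are not part of SDLF; only the name patterns come from the notes above.

```python
# Hypothetical helpers; only the naming patterns come from the release notes.
ENVIRONMENTS = ("dev", "test", "prod")


def _check_env(env: str) -> None:
    """Reject anything other than the three usual environments."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")


def main_repo_files(domain: str, env: str) -> dict[str, str]:
    """Template files expected in the sdlf-main repository."""
    _check_env(env)
    return {
        "foundations": f"foundations-{domain}-{env}.yaml",
        "teams": f"teams-{domain}-{env}.yaml",
    }


def team_repo_name(domain: str, team: str) -> str:
    """Repository created when a new team is created (2.0.0-beta.0 naming)."""
    return f"sdlf-{domain}-{team}-main"


def team_repo_files(env: str) -> dict[str, str]:
    """Template files expected in a team repository."""
    _check_env(env)
    return {
        "pipelines": f"pipelines-{env}.yaml",
        "datasets": f"datasets-{env}.yaml",
    }


print(main_repo_files("datalake", "dev"))
print(team_repo_name("datalake", "engineering"))
print(team_repo_files("dev"))
```

These names reflect 2.0.0-beta.0 only. Later pre-releases listed above changed them: 2.0.0-beta.4 switched to the sdlf-main-{domain}-{team} naming scheme (#188), and 2.0.0-rc.3 removed the -{env} suffix from files in the team repository (#226).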