Initial Deployment

Last updated: 2023-11-07

CDK Bootstrap

The account needs to be bootstrapped in all required regions with the following command:

cdk bootstrap aws://<account-id>/<region>
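
Because the command has to be repeated once per required region, a small loop can help. The following is a minimal sketch and not part of the repository: the function name `bootstrap_commands`, the account ID 123456789012, and the two regions are placeholders. It only prints the commands so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Sketch: print one `cdk bootstrap` command per region.
# bootstrap_commands, the account ID and the region list are placeholders,
# not values taken from this project.
bootstrap_commands() {
  local account_id="$1"
  shift
  local region
  for region in "$@"; do
    printf 'cdk bootstrap aws://%s/%s\n' "$account_id" "$region"
  done
}

# Review the generated commands first; append `| sh` to actually run them.
bootstrap_commands 123456789012 eu-central-1 us-east-1
```

Printing instead of executing keeps the loop safe to try with the wrong credentials loaded.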

CDK initial deployment

To set up the initial infrastructure, run the CDK code from cdk_init using the prepared action in the Makefile. This action creates the following resources:
- OIDC provider and IAM roles, so that GitHub Actions can create all other resources
- Secret values
  - apgw/jwt_psk in all regions. This value is used to sign the JWT tokens for the API Gateway.
  - datadog/credentials in all regions. Make sure the JSON is correct. This is needed to send Lambda function traces and metrics to Datadog.
  - rds/contentdb. This contains the access credentials for the content database.
  - rds/xmodb. This contains the access credentials for the xmo database.
- Parameter Store parameters
  - /openai/api_key. This contains the OpenAI API key.

All secrets and parameters are created by this step but must be filled with the real values afterwards. Never put credentials into the CDK code; always update the secrets in the AWS console after creation.

In addition, you have to deploy the DatadogIntegration stack manually to set up the Datadog integration.

First connection to the database

Log in to the EC2 bastion host via the AWS console and Session Manager.
- This is required to actually start the EC2 instance; otherwise the database will not be accessible.
Take the script ./scripts/db_connect.sh and fill in the correct EC2 instance ID, database URL, and preferred ports.
Run the script with the correct AWS credentials set in the environment variables.
Get the database credentials from Secrets Manager and connect via localhost.
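
The contents of ./scripts/db_connect.sh are not shown here. One common way to reach a database through a bastion host is Session Manager port forwarding, and the sketch below assumes the script works along those lines. The function name `start_db_tunnel`, the default port 5432 (the PostgreSQL default), and the values in the example call are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Sketch only: the real ./scripts/db_connect.sh may work differently.
# start_db_tunnel is a hypothetical helper. It forwards a local port through
# the bastion instance to the database host using the AWS-managed SSM document
# AWS-StartPortForwardingSessionToRemoteHost.
start_db_tunnel() {
  local instance_id="$1"      # EC2 instance ID of the bastion host
  local db_host="$2"          # database endpoint as seen from the bastion
  local db_port="${3:-5432}"  # remote port (assumed PostgreSQL default)
  local local_port="${4:-5432}"  # port to open on localhost
  aws ssm start-session \
    --target "$instance_id" \
    --document-name AWS-StartPortForwardingSessionToRemoteHost \
    --parameters "{\"host\":[\"$db_host\"],\"portNumber\":[\"$db_port\"],\"localPortNumber\":[\"$local_port\"]}"
}

# Example call with placeholder values (requires AWS credentials in the environment):
# start_db_tunnel i-0123456789abcdef0 mydb.example.eu-central-1.rds.amazonaws.com 5432 5432
```

While such a tunnel is open, a database client can connect to localhost on the chosen local port with the credentials from Secrets Manager.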

Create Index on the database

If you created a new database in the process (rather than restoring a snapshot), create an index on the embeddings after filling the database with some data. To do so, run the following commands:

Increase the maintenance work memory for the index creation. Without a unit the value is interpreted as kB, so 4194304 corresponds to 4 GB:

SET maintenance_work_mem = 4194304;

Create the index:

CREATE INDEX ON embeddings
USING ivfflat (embedding_vector vector_cosine_ops)
WITH (lists = 1800);

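Set the number of probes for the search queries. SET only affects the current session, so it has to be issued on every connection that runs search queries: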

SET ivfflat.probes = 43;
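
The text does not say how the value 43 was chosen. It is consistent with the starting point suggested in the pgvector documentation, probes = sqrt(lists), rounded up: sqrt(1800) is roughly 42.4. A small sketch of that arithmetic follows; `probes_for_lists` is a hypothetical helper name, and the rule of thumb is only a starting point for tuning recall against speed.

```shell
#!/usr/bin/env bash
# Sketch: derive a starting value for ivfflat.probes from the number of lists,
# using the sqrt(lists) rule of thumb and rounding up.
# probes_for_lists is a hypothetical helper, not part of the repository.
probes_for_lists() {
  awk -v lists="$1" 'BEGIN {
    s = sqrt(lists)
    p = int(s)
    if (p < s) p++   # round up when sqrt(lists) is not a whole number
    print p
  }'
}

# Emit the SET statement for an index built with lists = 1800.
printf 'SET ivfflat.probes = %s;\n' "$(probes_for_lists 1800)"
```

If the index is later rebuilt with a different number of lists, the same helper gives a matching first guess for probes.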

To see the status of the index creation:

SELECT
  now()::TIME(0),
  a.query,
  p.phase,
  -- NULLIF avoids a division-by-zero error in phases where blocks_total is 0
  round(p.blocks_done / NULLIF(p.blocks_total, 0)::numeric * 100, 2) AS "% done",
  p.blocks_total,
  p.blocks_done,
  p.tuples_total,
  p.tuples_done,
  ai.schemaname,
  ai.relname,
  ai.indexrelname
FROM pg_stat_progress_create_index p
JOIN pg_stat_activity a ON p.pid = a.pid
LEFT JOIN pg_stat_all_indexes ai ON ai.relid = p.relid AND ai.indexrelid = p.index_relid;

Importing existing resources to CDK

If you have created resources manually in the AWS console, you can import them into the CDK stack. Make sure that the CDK stack is up to date before importing. Then add the resources to the corresponding stack and run cdk synth. This creates the file 'cdk.out/.template.json'. Make sure that CDKMetadata contains the same value as in the last complete deploy run. If the values differ, you have to update them manually.

With the template.json file, you can import the resources into the CDK stack in the AWS console under CloudFormation. Select your stack and click on import resources. Then upload the template.json file and follow the instructions.

Last updated: March 31, 2026 at 13:19

By: Max Friedrich


Repository: PIT-GPT-Research/StatistaGPT