Before you can start querying Amazon DocumentDB events in the S3 bucket, complete the following steps to create an AWS Glue crawler that crawls the S3 bucket.
On the AWS Glue menu, select Crawlers. Choose Add crawler.
Enter crawler-change-streams as the crawler name for initial data load. Optionally, enter a description. Choose Next.
Choose Data stores, choose Crawl all folders, and choose Next.
On the Add a data store section, make the following selections:
On the Add another data store section, select No and choose Next.
On the Choose an IAM role section, make the following selections:
On the Create a schedule for this crawler section, for Frequency select Run on demand and choose Next.
On the Configure the crawler’s output section, choose Add database to create a new database in the AWS Glue Data Catalog.
Enter change-streams as your database name, leave everything else as is, and choose Create. Choose Next.
Review the summary page, noting the Include path (Data Stores section) and Database (Output section). Choose Finish. The crawler is now ready to run.
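For readers who prefer scripting to the console, the crawler setup above can be approximated with the AWS SDK for Python. This is a minimal sketch, not the method used in this walkthrough: it assumes a boto3 Glue client is passed in, and the S3 path and IAM role ARN are values you supply yourself, since they are not given here. The helper names build_crawler_request and create_crawler are hypothetical.

```python
def build_crawler_request(s3_path, role_arn):
    """Build keyword arguments for the Glue CreateCrawler API call."""
    return {
        "Name": "crawler-change-streams",
        "Role": role_arn,  # assumption: an IAM role Glue can assume
        "DatabaseName": "change-streams",
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # Corresponds to "Crawl all folders" in the console.
        "RecrawlPolicy": {"RecrawlBehavior": "CRAWL_EVERYTHING"},
        # No "Schedule" key is set, so the crawler only runs on demand.
    }


def create_crawler(glue, s3_path, role_arn):
    """Create the catalog database, then the crawler, with a Glue client."""
    request = build_crawler_request(s3_path, role_arn)
    # Raises an error if the database already exists; handle as needed.
    glue.create_database(DatabaseInput={"Name": request["DatabaseName"]})
    glue.create_crawler(**request)
    return request["Name"]
```

A call might look like create_crawler(boto3.client("glue"), "s3://<your-bucket>/<your-prefix>/", "<your-role-arn>"), where the bracketed values are placeholders for your own bucket path and role.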
Select the crawler-change-streams crawler and choose the Run crawler button.
The crawler status changes from Starting to Stopping. Wait until the status returns to Ready (the process takes a few minutes). The crawler then reports that it has added 1 table.
You can now see the data inside the S3 bucket.
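The run-and-wait steps above can also be scripted. The following sketch assumes a boto3 Glue client is passed in and polls the crawler state until it returns to READY, then lists the tables in the change-streams database. The function names, polling interval, and timeout are illustrative choices rather than part of this walkthrough.

```python
import time


def run_crawler_and_wait(glue, name="crawler-change-streams",
                         poll_seconds=15, timeout_seconds=900):
    """Start the crawler and poll until its state is READY again."""
    glue.start_crawler(Name=name)
    observed = []
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        # The API reports READY, RUNNING, or STOPPING.
        state = glue.get_crawler(Name=name)["Crawler"]["State"]
        observed.append(state)
        if state == "READY":
            return observed
        time.sleep(poll_seconds)
    raise TimeoutError(f"Crawler {name} did not reach READY in time")


def list_tables(glue, database="change-streams"):
    """Return table names in the database (first page of results only)."""
    response = glue.get_tables(DatabaseName=database)
    return [table["Name"] for table in response["TableList"]]
```

With a real client, run_crawler_and_wait(boto3.client("glue")) followed by list_tables(boto3.client("glue")) should return the single table the crawler added, assuming the crawl succeeded.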