<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Use-Cases on</title><link>https://rosettadb.io/use-cases/</link><description>Recent content in Use-Cases on</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Wed, 30 Oct 2024 00:00:00 +0000</lastBuildDate><atom:link href="https://rosettadb.io/use-cases/index.xml" rel="self" type="application/rss+xml"/><item><title>Generating dbt Models using RosettaDB</title><link>https://rosettadb.io/use-cases/generating-dbt-models-using-rosettadb/</link><pubDate>Wed, 30 Oct 2024 00:00:00 +0000</pubDate><guid>https://rosettadb.io/use-cases/generating-dbt-models-using-rosettadb/</guid><description>&lt;img src="https://rosettadb.io/use-cases/generating-dbt-models-using-rosettadb/rosetta-dbt.png" alt="Featured image of post Generating dbt Models using RosettaDB" />&lt;p>&lt;strong>RosettaDB&lt;/strong> is a tool for managing database schemas and for transforming database objects across different databases. Combined with dbt (data build tool), it lets you generate dbt models directly from your database schema, giving you a structured starting point for analysis. This guide walks through generating dbt models from a PostgreSQL schema with RosettaDB.&lt;/p>
&lt;h2 id="prerequisites">
&lt;a href="#prerequisites" class="header-anchor">#&lt;/a>
Prerequisites
&lt;/h2>&lt;p>Download JDBC drivers for your databases, install RosettaDB from the &lt;a class="link" href="https://github.com/rosettadb/rosetta/releases" target="_blank" rel="noopener"
>releases page&lt;/a>, and refer to the &lt;a class="link" href="https://github.com/rosettadb/rosetta#getting-started" target="_blank" rel="noopener"
>Quick Start Guide&lt;/a> for setup instructions.&lt;/p>
&lt;h2 id="setting-up-rosettadb">
&lt;a href="#setting-up-rosettadb" class="header-anchor">#&lt;/a>
Setting Up RosettaDB
&lt;/h2>&lt;h2 id="1-initialize-a-new-rosettadb-project">
&lt;a href="#1-initialize-a-new-rosettadb-project" class="header-anchor">#&lt;/a>
1. Initialize a New RosettaDB Project
&lt;/h2>&lt;p>To create a new project, use:&lt;/p>
&lt;pre tabindex="0">&lt;code>rosetta init dbt_postgres_project
&lt;/code>&lt;/pre>&lt;p>This will create a project directory with a &lt;code>main.conf&lt;/code> file for defining your database connections.&lt;/p>
&lt;h2 id="2-configure-database-connections">
&lt;a href="#2-configure-database-connections" class="header-anchor">#&lt;/a>
2. Configure Database Connections
&lt;/h2>&lt;p>Edit the &lt;code>main.conf&lt;/code> file to define your database connection. Here’s an example configuration for a PostgreSQL connection:&lt;/p>
&lt;pre tabindex="0">&lt;code>connections:
  - name: postgres_conn
    databaseName: analysis_db
    dbType: postgres
    url: jdbc:postgresql://localhost:5432/analysis_db
    userName: username
    password: password
&lt;/code>&lt;/pre>&lt;h2 id="3-extract-dbml-models">
&lt;a href="#3-extract-dbml-models" class="header-anchor">#&lt;/a>
3. Extract DBML Models
&lt;/h2>&lt;p>Run the &lt;code>rosetta extract&lt;/code> command to generate DBML models from your database schema.&lt;/p>
&lt;pre tabindex="0">&lt;code>rosetta extract -s postgres_conn
&lt;/code>&lt;/pre>&lt;p>Now that you have the DBML models, you can proceed to generate dbt models.&lt;/p>
&lt;h2 id="generating-dbt-models">
&lt;a href="#generating-dbt-models" class="header-anchor">#&lt;/a>
Generating dbt Models
&lt;/h2>&lt;p>Use the &lt;code>rosetta dbt&lt;/code> command to convert your extracted DBML into dbt models:&lt;/p>
&lt;pre tabindex="0">&lt;code>rosetta dbt -s postgres_conn
&lt;/code>&lt;/pre>&lt;p>This command will produce dbt models based on the extracted schema, ready to integrate into your dbt project.&lt;/p>
&lt;h2 id="example-dbt-model-output">
&lt;a href="#example-dbt-model-output" class="header-anchor">#&lt;/a>
Example dbt Model Output
&lt;/h2>&lt;p>Here’s an example of what the generated dbt models might look like, covering multiple tables for better context.&lt;/p>
&lt;p>&lt;code>model.yaml&lt;/code>&lt;/p>
&lt;pre tabindex="0">&lt;code>version: 2
sources:
  - name: Retail_Analysis
    description: &amp;#34;Data source for retail analysis&amp;#34;
    tables:
      - name: sales_transactions
        columns:
          - name: transaction_id
            tests:
              - not_null
              - unique
          - name: product_id
            tests:
              - not_null
          - name: customer_id
            tests:
              - not_null
          - name: transaction_date
            tests:
              - not_null
          - name: amount
            tests: []
      - name: products
        columns:
          - name: product_id
            tests:
              - not_null
              - unique
          - name: product_name
            tests: []
          - name: category
            tests: []
          - name: price
            tests: []
      - name: customers
        columns:
          - name: customer_id
            tests:
              - not_null
              - unique
          - name: first_name
            tests: []
          - name: last_name
            tests: []
          - name: email
            tests:
              - unique
          - name: registration_date
            tests: []
&lt;/code>&lt;/pre>&lt;p>&lt;strong>Example Models&lt;/strong>&lt;/p>
&lt;p>&lt;strong>1. Sales Transactions Model&lt;/strong>&lt;/p>
&lt;pre tabindex="0">&lt;code>with sales_transactions as (
    select
        transaction_id,
        product_id,
        customer_id,
        transaction_date,
        amount
    from {{ source(&amp;#39;Retail_Analysis&amp;#39;, &amp;#39;sales_transactions&amp;#39;) }}
)
select * from sales_transactions
&lt;/code>&lt;/pre>&lt;p>&lt;strong>2. Products Model&lt;/strong>&lt;/p>
&lt;pre tabindex="0">&lt;code>with products as (
    select
        product_id,
        product_name,
        category,
        price
    from {{ source(&amp;#39;Retail_Analysis&amp;#39;, &amp;#39;products&amp;#39;) }}
)
select * from products
&lt;/code>&lt;/pre>&lt;p>&lt;strong>3. Customers Model&lt;/strong>&lt;/p>
&lt;pre tabindex="0">&lt;code>with customers as (
    select
        customer_id,
        first_name,
        last_name,
        email,
        registration_date
    from {{ source(&amp;#39;Retail_Analysis&amp;#39;, &amp;#39;customers&amp;#39;) }}
)
select * from customers
&lt;/code>&lt;/pre>&lt;p>&lt;strong>Summary&lt;/strong>&lt;/p>
&lt;p>By following these steps, you can generate dbt models from your PostgreSQL schema using RosettaDB. Once the models are in your dbt project, you can run transformations, apply tests, and document your data workflow.&lt;/p>
&lt;p>For more details, check the official &lt;a class="link" href="https://docsrosetta.netlify.app/" target="_blank" rel="noopener"
>RosettaDB documentation&lt;/a> or reach out to the &lt;a class="link" href="https://join.slack.com/t/rosettadb/shared_invite/zt-1fq6ajsl3-h8FOI7oJX3T4eI1HjcpPbw" target="_blank" rel="noopener"
>community&lt;/a> if you need further assistance.&lt;/p></description></item><item><title>Migrate from PostgreSQL to MySQL</title><link>https://rosettadb.io/use-cases/migrate-from-postgresql-to-mysql/</link><pubDate>Sat, 20 May 2023 00:00:00 +0000</pubDate><guid>https://rosettadb.io/use-cases/migrate-from-postgresql-to-mysql/</guid><description>&lt;p>To migrate your PostgreSQL database to MySQL using Rosetta, you can follow these simple steps:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Install the required JDBC drivers for both PostgreSQL and MySQL databases.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Download and install Rosetta on your system.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Configure Rosetta to connect to your PostgreSQL and MySQL databases in a YAML config file. Here’s an example with one connection for each database:&lt;/p>
&lt;/li>
&lt;/ol>
&lt;pre tabindex="0">&lt;code>connections:
  - name: postgres_prod
    databaseName: mydatabase
    schemaName: public
    dbType: postgres
    url: jdbc:postgresql://localhost:5432/mydatabase
    userName: user
    password: pass
  - name: mysql_prod
    databaseName: mydatabase
    schemaName: myschema
    dbType: mysql
    url: jdbc:mysql://localhost:3306/mydatabase
    userName: user
    password: pass
&lt;/code>&lt;/pre>&lt;ol start="4">
&lt;li>Use Rosetta to generate DDL from your PostgreSQL database and transpile it to MySQL by running the following command:&lt;/li>
&lt;/ol>
&lt;pre tabindex="0">&lt;code>rosetta generate --source=postgres_prod --target=mysql_prod --output-dir=./mysql_ddl
&lt;/code>&lt;/pre>&lt;p>This will generate the MySQL DDL files in the &lt;code>./mysql_ddl&lt;/code> directory.&lt;/p>
&lt;ol start="5">
&lt;li>Execute the generated DDL files on your MySQL database to create the required tables, indexes, and other objects.&lt;/li>
&lt;/ol></description></item><item><title>Test Data Accuracy</title><link>https://rosettadb.io/use-cases/test-data-accuracy/</link><pubDate>Fri, 19 May 2023 00:00:00 +0000</pubDate><guid>https://rosettadb.io/use-cases/test-data-accuracy/</guid><description>&lt;p>To test the accuracy of your migrated data in RosettaDB, you can follow these general steps:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Define your expected results: Before performing any testing, define what the expected data results should be after migration.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Select a representative sample of data: Choose a representative sample of data from your source database that includes all types of data (e.g., text, numeric, date/time, etc.) and represents a typical use case.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Migrate the sample data: Use RosettaDB to migrate the selected sample data from your source database to the target database.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Verify the migrated data: Once the migration is complete, verify the migrated data in the target database against the expected results defined in step 1.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Perform additional testing: If necessary, perform additional testing on other parts of your data or with different sets of data.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Document your findings: Keep track of any issues or discrepancies found during testing, and document how they were resolved.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Repeat testing: After resolving any issues found during testing, repeat the testing process to ensure that the migration was successful and accurate.&lt;/p>
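&lt;p>The comparison from step 4, which this step repeats, can be sketched in Python. This is a minimal sketch rather than a RosettaDB feature: the helper names &lt;code>table_fingerprint&lt;/code>, &lt;code>tables_match&lt;/code>, and &lt;code>make_sample_db&lt;/code> are hypothetical, and in-memory SQLite databases stand in for the real source and target so the example runs on its own.&lt;/p>

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key_column):
    """Return (row_count, sha256 hex digest) for a table, reading rows in key order."""
    # Identifiers cannot be bound as query parameters, so pass only trusted names.
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    digest = hashlib.sha256()
    row_count = 0
    for row in cursor:
        # repr() keeps 1 and '1' distinct, so a change of type alters the digest.
        digest.update(repr(row).encode("utf-8"))
        row_count += 1
    return row_count, digest.hexdigest()


def tables_match(source_conn, target_conn, table, key_column):
    """True when both tables have the same row count and identical content."""
    return (table_fingerprint(source_conn, table, key_column)
            == table_fingerprint(target_conn, table, key_column))


def make_sample_db(rows):
    """Build an in-memory database that stands in for a real source or target."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    sample = [(1, "ann@example.com"), (2, "bob@example.com")]
    source = make_sample_db(sample)
    target = make_sample_db(sample)
    print(tables_match(source, target, "customers", "customer_id"))  # prints True
```

&lt;p>Against two different database engines, the drivers may return the same value as different Python types (for example a decimal versus a float), so values would likely need normalizing before hashing.&lt;/p>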
&lt;/li>
&lt;/ol></description></item><item><title>Generate Spark Python and Scala Data Transfer Code</title><link>https://rosettadb.io/use-cases/generate-spark-python-and-scala-data-transfer-code/</link><pubDate>Thu, 18 May 2023 00:00:00 +0000</pubDate><guid>https://rosettadb.io/use-cases/generate-spark-python-and-scala-data-transfer-code/</guid><description>&lt;p>To generate Spark Python and Scala data transfer code with RosettaDB, you can follow these steps:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Install the required JDBC drivers for your source and target databases.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Download and install Rosetta on your system.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Configure Rosetta to connect to your source and target databases. You can do this by updating the YAML config file with the connection details for each database.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Use Rosetta to generate DDL from your source database and transpile it to your desired target. You can do this by running the Rosetta CLI command with the appropriate arguments.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Once you have generated the DDL, you can use Spark Python or Scala to transfer the data between the source and target databases. You can do this by writing code that reads data from the source database using Spark SQL or DataFrame APIs and writes the data to the target database using JDBC.&lt;/p>
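&lt;p>A hand-written PySpark version of this step might look like the sketch below. It is not the code Rosetta generates. The helper names &lt;code>jdbc_options&lt;/code>, &lt;code>transfer_table&lt;/code>, and &lt;code>main&lt;/code>, the connection details, and the table names are all placeholders, and the sketch assumes the target table already exists and that both JDBC drivers are on Spark's classpath.&lt;/p>

```python
def jdbc_options(url, user, password, driver):
    """Collect the connection options that Spark's JDBC data source expects."""
    return {"url": url, "user": user, "password": password, "driver": driver}


def transfer_table(spark, source_opts, target_opts, source_table, target_table):
    """Read one table from the source over JDBC and append it to the target."""
    frame = (
        spark.read.format("jdbc")
        .options(**source_opts)
        .option("dbtable", source_table)
        .load()
    )
    (
        frame.write.format("jdbc")
        .options(**target_opts)
        .option("dbtable", target_table)
        .mode("append")  # assumes the target table was already created from the DDL
        .save()
    )


def main():
    # Imported here so the helpers above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    # Placeholder connection details; replace them with your own.
    source = jdbc_options(
        "jdbc:postgresql://localhost:5432/mydatabase",
        "user", "pass", "org.postgresql.Driver",
    )
    target = jdbc_options(
        "jdbc:mysql://localhost:3306/mydatabase",
        "user", "pass", "com.mysql.cj.jdbc.Driver",
    )

    spark = SparkSession.builder.appName("rosetta-transfer-sketch").getOrCreate()
    try:
        transfer_table(spark, source, target, "public.customers", "customers")
    finally:
        spark.stop()
```

&lt;p>Call &lt;code>main()&lt;/code> once both databases are reachable. Append mode is used so that the schema created from the generated DDL stays in place instead of being replaced by one that Spark infers.&lt;/p>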
&lt;/li>
&lt;/ol></description></item><item><title>Generating DDL</title><link>https://rosettadb.io/use-cases/generate-ddl-rosettadb/</link><pubDate>Mon, 15 May 2023 00:00:00 +0000</pubDate><guid>https://rosettadb.io/use-cases/generate-ddl-rosettadb/</guid><description>&lt;p>To generate DDL in Rosetta, you can follow these steps:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Install the required JDBC drivers for your source and target databases.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Download and install Rosetta on your system.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Configure Rosetta to connect to your source and target databases. You can do this by updating the YAML config file with the connection details for each database.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Use the &lt;code>rosetta generate&lt;/code> command to generate DDL from your source database. The syntax of the command is as follows:&lt;/p>
&lt;/li>
&lt;/ol>
&lt;pre tabindex="0">&lt;code>rosetta generate --source=&amp;lt;source_db_type&amp;gt; --target=&amp;lt;target_db_type&amp;gt;
&lt;/code>&lt;/pre>&lt;p>Replace &lt;code>&amp;lt;source_db_type&amp;gt;&lt;/code> with the type of your source database (e.g., mysql, postgres, oracle, etc.), and replace &lt;code>&amp;lt;target_db_type&amp;gt;&lt;/code> with the type of your target database.&lt;/p>
&lt;p>You can also specify the following optional parameters:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;code>--output=&amp;lt;output_file&amp;gt;&lt;/code>: Specify the name of the output file where the generated DDL will be written. If not specified, the DDL will be written to stdout.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;code>--logging-level=&amp;lt;logging_level&amp;gt;&lt;/code>: Set the logging level (debug, info, warn, or error). Default is info.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;code>--tables=&amp;lt;table_list&amp;gt;&lt;/code>: Specify a list of tables to generate DDL for. If not specified, DDL will be generated for all tables in the source database.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;code>--schemas=&amp;lt;schema_list&amp;gt;&lt;/code>: Specify a list of schemas to generate DDL for. If not specified, DDL will be generated for all schemas in the source database.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Here’s an example command to generate DDL for a MySQL source database and a Postgres target database:&lt;/p>
&lt;pre tabindex="0">&lt;code>rosetta generate --source=mysql --target=postgres \
--output=ddl.sql \
--tables=my_table_1,my_table_2 \
--schemas=my_schema_1,my_schema_2
&lt;/code>&lt;/pre>&lt;p>In this example, we’re generating DDL for two specific tables (my_table_1 and my_table_2) and two specific schemas (my_schema_1 and my_schema_2). The generated DDL will be written to a file called ddl.sql.&lt;/p>
&lt;ol start="5">
&lt;li>Once you have the generated DDL, you can execute it on your target database to create the necessary schema and tables. You can use any SQL client or tool to execute the DDL script.&lt;/li>
&lt;/ol>
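&lt;p>If you prefer to script step 5 instead of using a SQL client, the idea can be sketched in Python. This is an illustration under stated assumptions: the helper &lt;code>apply_ddl&lt;/code> is hypothetical, the DDL text is made up for the example, and an in-memory SQLite database stands in for the real target, which would need its own driver and catalog query.&lt;/p>

```python
import sqlite3


def apply_ddl(conn, ddl_text):
    """Run every statement in a DDL script and return the tables that exist afterwards."""
    conn.executescript(ddl_text)
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Illustrative DDL; in practice, read the text from the generated ddl.sql file.
    ddl = """
    CREATE TABLE my_table_1 (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE my_table_2 (id INTEGER PRIMARY KEY, my_table_1_id INTEGER);
    """
    print(apply_ddl(sqlite3.connect(":memory:"), ddl))
    # prints ['my_table_1', 'my_table_2']
```

&lt;p>Listing the tables after the script runs gives a quick check that every object in the DDL was created.&lt;/p>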
&lt;p>Note that Rosetta generates declarative DBML models that can be used for conversion to alternate database targets. However, the generated DDL may require further modifications and optimizations to suit your specific use case and database configurations.&lt;/p></description></item></channel></rss>