Hi Erman,
Our BI environment uses Oracle GoldenGate to replicate tables from the source environment, which runs on Solaris servers. The BI environment is the same as the source, since we use a downstream configuration for GoldenGate. This architecture uses SQL/PLSQL scripts to process data from the replicated tables.

BIARCHITECTURE.jpg

We have another environment which uses Big Data. Instead of the SQL/PLSQL scripts of the architecture above, it uses StreamSets/Talend to extract data from the source and transform it.

BIARCHITECTURE2.jpg

The replication tool for the source is StreamSets. Kindly advise which of the above is the better architecture in terms of:
1. Costing
2. Reliability
and any other metrics. Each architecture has some flaws and benefits when we compare them. Also, in what ways can we integrate them?

Thanks,
Roshan
Administrator
Hi Roshan,
I don't know anything about StreamSets/Talend. Can it perform CDC? GoldenGate is reliable and it can do CDC, as you know. It is the best solution for CDC in Oracle environments.

PL/SQL transformation is cheap, but it should be managed well. It should be under the control of some sort of application; otherwise, you will do lots of maintenance.

The two diagrams that you sent tell me different stories. In one of them you use Hadoop, and in the other you use Oracle RDBMS and RAC. I don't know your business story or your purpose for building this kind of environment, but you should choose the environment according to your needs.

Using Oracle and Hadoop together is an alternative that we consider when dealing with enterprise data warehouses. We generally use Hadoop for offloading some of the processing, and we still use Exadata (Oracle RAC) in the front line. Orchestration and integration between the platforms is also important.

As for CDC, there are other solutions available on the market. One of them is Striim. We (as GTech) are its distributor, so maybe you can take a look at it as well. Of course, if you want a POC or consultancy, we are happy to help. We have already implemented this solution in lots of enterprise companies.
Striim -> https://www.striim.com

Also, you may want to check Gluent. We started to deal with it recently.
Gluent -> https://gluent.com/
Hi Erman,
After GoldenGate extracts data from the source and replicates it to our BI environment, we use SQL/PLSQL scripts, scheduled in crontab, to process the replicated tables and load the results into a temporary table. The issue is that as the data (tables) grows, the processing time will increase and more resources will be needed.

I would like to connect Hadoop to our BI platform through StreamSets, so that we no longer need the SQL/PLSQL scripts:

BI servers (on Solaris and RAC) --- StreamSets --- Hadoop

https://streamsets.com/

Second, you mentioned: "PLSQL transformation is cheap but it should be managed well. It should be under control of some sort of application. Else, you will do lots of maintenance." Please advise which sort of application we should use.

Thanks,
Roshan
Administrator
StreamSets seems to be something like Big Data SQL. We have the Big Data SQL option; you know that, right?
Oracle Big Data SQL and Gluent are alternatives for this approach. The idea is correct: you offload some of the processing to cheap Big Data platforms (like Hadoop).

As for PL/SQL-related transformation, we generally use ODI.

Lastly, crontab-based scheduling is very primitive; you should replace that first. Actions should be event driven: one should start right after another, or they should run in parallel when necessary. You can even write your own code to orchestrate that transformation.
Thanks Erman. What do you mean by actions should be event driven?
Administrator
I mean, there shouldn't be any gaps between one action and the next (unless you want it to be so).
Some actions should run in parallel, and then they may be serialized again when necessary. For instance, during a data load or transformation, we normally implement a flow like this: when event A finishes, it triggers event B, so event B starts immediately, and so on. However, when you use crontab (crond, I mean), each job starts at a fixed time regardless of when the previous one finished, so you may have gaps.
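
To make the idea concrete, here is a minimal Python sketch of such an event-driven flow; it is an illustration only, not the way ODI or any particular scheduler implements it. The step names (`load_staging`, `transform_sales`, `transform_customers`, `publish_summary`) and the helper names (`run_flow`, `make_step`, `EXAMPLE_DEPS`) are hypothetical, and the step bodies are stand-ins for the real SQL/PLSQL calls. It assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter
import time

# Hypothetical flow: each step maps to the set of steps it must wait for.
# load_staging has no entry of its own; it is picked up as a predecessor.
EXAMPLE_DEPS = {
    "transform_sales": {"load_staging"},
    "transform_customers": {"load_staging"},
    "publish_summary": {"transform_sales", "transform_customers"},
}


def make_step(name, seconds=0.05):
    """Build a placeholder step; a real one would call a SQL/PLSQL script."""
    def step():
        time.sleep(seconds)  # stand-in for the actual work
        return name
    return step


def run_flow(steps, deps, max_workers=4):
    """Run the steps so that each one starts as soon as its predecessors finish.

    Independent steps run in parallel; returns the order of completion.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    finished = []
    running = {}  # future -> step name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Start every step whose predecessors are all done.
            for name in sorter.get_ready():
                running[pool.submit(steps[name])] = name
            # Block until at least one running step completes (the "event").
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                future.result()  # re-raise the step's exception, if any
                finished.append(name)
                sorter.done(name)  # this is what unblocks dependent steps
    return finished


if __name__ == "__main__":
    names = ["load_staging", "transform_sales",
             "transform_customers", "publish_summary"]
    print(run_flow({n: make_step(n) for n in names}, EXAMPLE_DEPS))
```

Here the two transform steps start the moment the load completes and run side by side, and the final step starts as soon as the slower of the two finishes. With crond, each of these would instead need a fixed start time chosen with enough safety margin, which is where the gaps come from.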
In reply to this post by ErmanArslansOracleBlog
Hi Erman,
I would like to test Striim for replicating data from Oracle/Mongo to Hadoop services (Kudu, Mongo and Oracle). I have installed the demo version to test replication from GoldenGate trail files (on our data warehouse) to Kudu (Hadoop). The issue is that I cannot use the GG Trail option as a source. Kindly advise.

Capture.PNG

Thanks,
Roshan
Administrator
That may not be enabled in the trial version, but we are checking. I will get back to you.
If you are considering a Striim implementation, we can guide you. We have a department that is responsible for Striim. So if you are considering it seriously, send me an email (erman.arslan@gtech.com.tr) from your company email address, so that I can redirect you to the Striim consultants.