Once we have developed a Pentaho ETL job that meets the stated business requirement, it needs to be run in order to populate fact tables or business reports. If the job holds only a couple of transformations and the requirement is not very complex, it can be run manually from the PDI (Pentaho Data Integration) client itself.

However, real-time Pentaho ETL workloads typically consist of a large number of jobs and transformations with heavy parallel and sequential processing, and running them locally from the PDI client is not reliable; it is error prone. A production-grade job also demands computational power and network bandwidth that a local workstation cannot provide, so these jobs must be hosted on production servers that satisfy those requirements. In addition, a job running locally in the PDI client is prone to fail because of physical or network problems on that machine.

Once the jobs are deployed on production servers, they need to be run on a schedule or triggered manually. Every complex data warehouse system has to run its jobs on a fixed schedule, because most jobs depend on one another.

Scheduling, however, is just one aspect of running ETL tasks in a production environment. Data must also be gathered to measure how well the processes are executed. For example, there must be some form of notification to confirm whether an automated run has taken place, and additional measures must be in place so that system administrators can quickly verify and, if necessary, diagnose and repair the data integration process. We refer to these activities as monitoring.
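As a concrete sketch of server-side scheduling (not the author's actual setup): PDI ships a command-line runner, `kitchen.sh`, for executing `.kjb` job files outside the graphical client, which makes cron-based scheduling straightforward. The installation path, job file, and log location below are hypothetical placeholders; adjust them for your environment.

```shell
# Hypothetical crontab entry: run the ETL job nightly at 02:00 via PDI's
# kitchen.sh runner, appending all output to a log file so administrators
# can verify and diagnose each run later (paths and job name are placeholders).
0 2 * * * /opt/pentaho/data-integration/kitchen.sh -file=/etc/pdi/jobs/load_dwh.kjb -level=Basic >> /var/log/pdi/load_dwh.log 2>&1
```

Because `kitchen.sh` returns a non-zero exit status when the job fails, the cron entry can also be pointed at a small wrapper script that e-mails the log to an operator on failure, giving a basic form of the monitoring described above.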