Question

Best Practices for Organizing & Naming Data Pools in OCPM (Source, Combined, and Derived Pools)

  • November 4, 2025
  • 2 replies
  • 37 views

shalin.lalla
Level 2

Hi Community,

I’m working on cleaning up our data pools in OCPM, which currently include:

  • Migrated CPM → OCPM pools
  • New OCPM pools
  • Pools that extract data directly from source systems
  • Pools that combine multiple source extractions
  • Derived data pools (with filters, perspectives, etc.)

What is a good way to organize or separate these different types of data pools? Specifically:

  • Is there a recommended approach to distinguish source-based pools from combined or derived pools?
  • Are there best practices for naming conventions to make it easier to manage and identify them?
  • How can I easily see which data pools are connected to which assets (apps, views, packages)?

Any guidance or examples would be greatly appreciated!

Thanks in advance,
Shalin

I am aware of this question, but it was posted a while ago and does not provide the granularity I am looking for 🙁

https://community.celonis.com/topic/show?tid=4263&fid=2

 

2 replies

manuel.wetze
Level 9
  • November 7, 2025

Love the question!
I don’t want the following to be understood as best practice; I’m just sharing what I find useful myself. Happy to hear contradicting opinions.

The challenge I’d put to you is whether you have too many OCPM data pools if you need a naming convention to maintain them. I can see scenarios where a handful of them are needed, but generally I think there should be only one OCPM pool (with multiple perspectives = data models in it). But maybe I am just lacking a big-scale operation :D

To the question to distinguish source-based pools from combined or derived pools:
I am personally a big fan of having a single central “Parentpool” which is connected to all sources imaginable. It handles extractions and generic transformations that are relevant to all downstream data usage. So I think this covers what you were referring to as “source-based” + “combined”. This one is literally called “Parentpool”.
From this one I share views/tables to certain “childpools”. One of them is called OCPMPool (as I only have one), but I also share to other case-centric data pools, which are named after the shared use case.
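
To make that layout easier to picture, here is a minimal Python sketch of the parent/child structure as a plain inventory. It does not call any Celonis API; the `Pool` class, the `shared_from` field, and the "Accounts Payable" pool name are hypothetical and only illustrate the idea of one parent pool feeding several child pools.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pool:
    name: str
    # Name of the pool that shares views/tables into this one (None for the parent).
    shared_from: Optional[str] = None


# Hypothetical inventory: one parent pool, one OCPM child, one case-centric child.
pools = [
    Pool("Parentpool"),
    Pool("OCPMPool", shared_from="Parentpool"),
    Pool("Accounts Payable", shared_from="Parentpool"),  # assumed use-case name
]


def children_of(parent: str, inventory: List[Pool]) -> List[str]:
    """Return the names of all pools that receive shared data from `parent`."""
    return [p.name for p in inventory if p.shared_from == parent]
```

Calling `children_of("Parentpool", pools)` lists every child pool in one place, which is roughly the overview a naming convention would otherwise have to provide.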


axel.bühle12
Level 4

You could look at data warehouse layers for guidance:

We have one extraction pool, which represents the source layer (see below). We do not have a staging layer, though one might be a good idea for pre-processing some data across source systems. Our various productive pools (OCPM or CPM) are the domain-specific modeling layer. The presentation layer would be the knowledge model / studio.

1. Source Layer
The source layer is the foundational layer that stores data in multiple databases and operational systems. You cannot usually access your data directly from here since it is a storage unit for further processing. Some common sources are NoSQL DBs, APIs, relational DBs, etc.

2. Staging Layer
The staging layer temporarily holds your collected data to clean, validate, and transform it. Different ETL tools come together to carry out data preparation tasks.

3. Modeling Layer
Structured data is stored here in an organized manner for analytical querying and reporting. Star schemas and Snowflake schemas are common data models for this layer.

4. Presentation Layer
Also known as the consumption layer, it offers a user-friendly interface for easy data access and analysis. You can use visualization tools, such as reports and dashboards, to interact with this information.
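
If you want the layer to be visible in the pool name itself, one option is a prefix per layer. The sketch below is only an illustration of that idea: the prefixes `SRC`, `STG`, and `MDL`, the underscore separator, and the example subject names are assumptions, not a Celonis convention.

```python
from typing import Optional

# Hypothetical prefixes, one per warehouse layer that maps to a data pool.
LAYER_PREFIXES = {"source": "SRC", "staging": "STG", "modeling": "MDL"}


def pool_name(layer: str, subject: str) -> str:
    """Build a pool name such as 'SRC_SAP_ECC' from a layer and a subject."""
    return f"{LAYER_PREFIXES[layer]}_{subject}"


def layer_of(name: str) -> Optional[str]:
    """Return the layer encoded in a pool name, or None if it has no known prefix."""
    prefix = name.split("_", 1)[0]
    for layer, known_prefix in LAYER_PREFIXES.items():
        if known_prefix == prefix:
            return layer
    return None
```

A check like `layer_of(name) is None` can then flag pools that do not follow the convention, for example migrated CPM pools that still carry their old names.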