Saturday, 25 June 2011

Miscellaneous standards


1.         All ports to be in CAPITAL LETTERS for flat files.
2.         Description of all the transformations (including sources and targets) is to 
            be filled appropriately.
3.        There should be no unused ports in any transformations.
4.        Connection information of lookup should be $SOURCE or $TARGET.
5.         For output only ports, all error (Transformation error) messages must be removed.
6.         When Strings are read from a fixed width flat file source and compared, 
        ensure that the strings are trimmed properly (LTRIM, RTRIM) for blankspaces.
7.       Where ever the fields returned from a lookup transformation are used, null 
          values for these fields should be handled accordingly.
8.      When ports are dragged to the successive transformations, the names should 
          not be changed.
9.       Common sources and target tables are to be used from ‘COMMON’ folder.

10. By default “edit null charaters to space” option in session definition for file targets has to be selected.
11. Check the return status of pmcmd. If non-zero then exit.
12. ALL CONTROL REPORTS TO BE IN CAPS - Please modify the necessary scripts.  RECORDS PROCESSED: 45  RECORDS REJECTED: 0

Note there is a ":" separating the description and the value in the above example.This should be standard for all control reports.

13. ALL CONTROL REPORTS TO BE IN CAPS
14. The Default values of all the output ports should be removed.
15. All the components in the mapping should have comments.
16.All Ports in should be used and connected.

17. The Tracing level value should be set to 'Terse' in all transformations across the mapping.

18. All the 'Error (Transformation error)' messages should be removed from default value in the Expression transformation.

19.The Mapping should not contain any unused Source/Target Definitions.

20. The Override Tracing should be set to 'TERSE' in the Session properties of Error handling.

21. The 'Enable High Precision' should be disabled in the Session properties of Performance.

22. The Stop on error property should be greater than 0. There maybe exceptions to the rule all such cases should be documented in the design document

23. The Session log file directory, Cache directory, File name, database connection stings, reject file directory, target connection string should start with a $.

24. The port names should be as per Standards.

25. The Data type of ports in source qualifier and source should be matching.

26. The sequence of fields in the sql query of a source qualifier should be in the same order as that of the order of the ports in it.

27. The filter expression should be coded properly in the following format: IIF (condition, True, False).

28. The lookup transformation should not be having fields, which are neither checked as out port nor used as lookup only port.

29.Usage of variables  Variable expressions can reference any input port. So as a matter of good design principle, in a transform, variables should be defined after all the input ports have been defined.  Variables that will be used by other variable(s) should be the first to be defined.  Variables can reference other variable(s) in an expression. NOTE: the ordering of variables in a transform is very important. All the variables using other variables in their expression should be defined in the order of the dependency. In the table above, Month is used to calculate Q1_sales. Hence, Q1_sales must always be defined after Month.

Variables are initialized to 0 for numeric variables or empty string for character/date variables.

Variables can be used for temporary storage (for example, PREVIOUS_STATE is a temporary storage variable which will be overwritten when a new data record is read and processed).

Local variables also reduce the overhead of recalculations. If a complex calculation is to be used throughout a mapping then it is recommended to write the expression once and designate it as a variable. The variable can then be passed to other transforms as an input port. This will increase performance, as the Informatica Server will perform the calculation only once. For example, the variable Q1_sales can be passed from one transformation to another rather than redefining the same formula in different transforms.

Variables can remember values across rows and they retain their value until the next evaluation of the variable expression. Hence, the order of the variables can be used to compute procedural computations. For example, let V1 and V2 be two variables defined with the following expressions and V1 occurs before V2: V1 has the expression ‘V2+1’ V2 has the expression ‘V1+1’ In this case, V1 gets the previous rows’ value for V2 since V2 occurs after V1 and is evaluated after V1. But V2 gets the current rows’ value of V1 since V1 is already evaluated by the time V2 is evaluated.


Informatica can perform data sorting based on a key defined for a source but it would be beneficial to have an index (matching the sort criteria) on a table for better performance. This is especially helpful while performing key based aggregations as well as lookups.
31.An external procedure/function call should always return a value. The return value will be used to determine the next logical step in the mapping execution.
32.Performance of an aggregator can be improved by presorting the data before passing it to the aggregator.
33.Filter data as early as possible in the ETL
34.Use custom SQL in Source Qualifier for the WHERE clause to filter data. This lets the database do the work, where it can be handled the most efficiently
35.A stored procedure should always return a value. The return value will be used by Informatica to evaluate the success or failure of the event. It can also be used to control the outcome of the mapping execution.
36.Stored procedures used inside of a mapping usually create poor performance. Therefore, it is recommended to avoid using stored procedures inside of a mapping.
37. Mapping a string to an integer, or an integer to a string will perform the conversion, however it will be slower than creating an output port with an expression like: to_integer(xxxx) and mapping an integer to an integer. It's because PMServer is left to decide if the conversion can be done mid-stream which seems to slow things down

No comments:

Post a Comment