Archive

Archive for the ‘BI Testing’ Category

BI Testing-SQL Performance tuning

March 29, 2012 Leave a comment

Introduction:

Generally ETL performance testing is confirmation test to ensure that an ETL ‘system’ can handle the load of multiple users and transaction.  For any project this is primarily ensuring that the ‘system’ can easily manage the throughput of millions of transactions.

You can improve your application performance by optimizing the queries you use. The following sections outline techniques you can use to optimize query performance.

Improve Indexes:

  • Creating useful indexes is one of the most important ways to achieve better query performance. Useful indexes help you find data with fewer disk I/O operations and less system resource usage.
  • To create useful indexes, you must understand how the data is used, the types of queries and the frequencies they run, and how the query processor can use indexes to find your data quickly.
  • When you choose what indexes to create, examine your critical queries, the performance of which will affect the user’s experience most. Create indexes to specifically aid these queries. After adding an index, rerun the query to see if performance is improved. If it is not, remove the index.
  • As with most performance optimization techniques, there are tradeoffs. For example, with more indexes, SELECT queries will potentially run faster. However, DML (INSERT, UPDATE, and DELETE) operations will slow down significantly because more indexes must be maintained with each operation. Therefore, if your queries are mostly SELECT statements, more indexes can be helpful. If your application performs many DML operations, you should be conservative with the number of indexes you create.

Choose what to Index:

  • We recommend that you always create indexes on primary keys. It is frequently useful to also create indexes on foreign keys. This is because primary keys and foreign keys are frequently used to join tables. Indexes on these keys let the optimizer consider more efficient index join algorithms. If your query joins tables by using other columns, it is frequently helpful to create indexes on those columns for the same reason.
  • When primary key and foreign key constraints are created, SQL Server Compact 3.5 automatically creates indexes for them and takes advantage of them when optimizing queries. Remember to keep primary keys and foreign keys small. Joins run faster this way.

Use Indexes with Filter Clauses

  • Indexes can be used to speed up the evaluation of certain types of filter clauses. Although all filter clauses reduce the final result set of a query, some can also help reduce the amount of data that must be scanned.
  • A search argument (SARG) limits a search because it specifies an exact match, a range of values, or a conjunction of two or more items joined by AND. It has one of the following forms:

Understand Response Time Vs Total Time:

  • Response time is the time it takes for a query to return the first record. Total time is the time it takes for the query to return all records. For an interactive application, response time is important because it is the perceived time for the user to receive visual affirmation that a query is being processed. For a batch application, total time reflects the overall throughput. You have to determine what the performance criteria are for your application and queries, and then design accordingly.

Example:

  • Suppose the query returns 100 records and is used to populate a list with the first five records. In this case, you are not concerned with how long it takes to return all 100 records. Instead, you want the query to return the first few records quickly, so that you can populate the list.
  • Many query operations can be performed without having to store intermediate results. These operations are said to be pipelined. Examples of pipelined operations are projections, selections, and joins. Queries implemented with these operations can return results immediately. Other operations, such as SORT and GROUP-BY, require using all their input before returning results to their parent operations. These operations are said to require materialization. Queries implemented with these operations typically have an initial delay because of materialization. After this initial delay, they typically return records very quickly.
  • Queries with response time requirements should avoid materialization. For example, using an index to implement ORDER-BY, yields better response time than using sorting. The following section describes this in more detail.

Index the ORDER-BY / GROUP-BY / DISTINCT Columns for Better Response Time

  • The ORDER-BY, GROUP-BY, and DISTINCT operations are all types of sorting. The SQL Server Compact 3.5 query processor implements sorting in two ways. If records are already sorted by an index, the processor needs to use only the index. Otherwise, the processor has to use a temporary work table to sort the records first. Such preliminary sorting can cause significant initial delays on devices with lower power CPUs and limited memory, and should be avoided if response time is important.
  • In the context of multiple-column indexes, for ORDER-BY or GROUP-BY to consider a particular index, the ORDER-BY or GROUP-BY columns must match the prefix set of index columns with the exact order. For example, the index CREATE INDEX Emp_Name ON Employees (“Last Name” ASC, “First Name” ASC) can help optimize the following queries:
    • .. ORDER BY / GROUP BY “Last Name” …
    • … ORDER BY / GROUP BY “Last Name”, “First Name” …

It will not help optimize:

  • … ORDER BY / GROUP BY “First Name” …
  • … ORDER BY / GROUP BY “First Name”, “Last Name” …

For a DISTINCT operation to consider a multiple-column index, the projection list must match all index columns, although they do not have to be in the exact order. The previous index can help optimize the following queries:

  • … DISTINCT “Last Name”, “First Name” …
  • … DISTINCT “First Name”, “Last Name” …

It will not help optimize:

  • … DISTINCT “First Name” …
  • … DISTINCT “Last Name” …

Rewrite Subqueries to Use JOIN

Sometimes you can rewrite a subquery to use JOIN and achieve better performance. The advantage of creating a JOIN is that you can evaluate tables in a different order from that defined by the query. The advantage of using a subquery is that it is frequently not necessary to scan all rows from the subquery to evaluate the subquery expression. For example, an EXISTS subquery can return TRUE upon seeing the first qualifying row.

Example:

To determine all the orders that have at least one item with a 25 percent discount or more, you can use the following EXISTS subquery:

SELECT “Order ID” FROM Orders O

WHERE EXISTS (SELECT “Order ID”

FROM “Order Details” OD

WHERE O.”Order ID” = OD.”Order ID”

AND Discount >= 0.50)

You can rewrite this by using JOIN:

SELECT DISTINCT O.”Order ID” FROM Orders O INNER JOIN “Order Details”

OD ON O.”Order ID” = OD.”Order ID” WHERE Discount >= 0.50

Limit Using Outer JOINs

OUTER JOINs are treated differently from INNER JOINs in the optimizer. It does not try to rearrange the join order of OUTER JOIN tables as it does to INNER JOIN tables. The outer table (the left table in LEFT OUTER JOIN and the right table in RIGHT OUTER JOIN) is accessed first, followed by the inner table. This fixed join order could lead to execution plans that are less than optimal.

Use Parameterized Queries:

  • If your application runs a series of queries that are only different in some constants, you can improve performance by using a parameterized query. For example, to return orders by different customers, you can run the following query:
  • SELECT “Customer ID” FROM Orders WHERE “Order ID” = ?
  • Parameterized queries yield better performance by compiling the query only once and executing the compiled plan multiple times. Programmatically, you must hold on to the command object that contains the cached query plan. Destroying the previous command object and creating a new one destroys the cached plan. This requires the query to be re-compiled. If you must run several parameterized queries in interleaved manner, you can create several command objects, each caching the execution plan for a parameterized query. This way, you effectively avoid re-compilations for all of them.

17 Tips for Avoiding Problematic Queries

1. Avoid Cartesian products
2. Avoid full table scans on large tables
3. Use SQL standards and conventions to reduce parsing
4. Lack of indexes on columns contained in the WHERE clause
5. Avoid joining too many tables
6. Monitor V$SESSION_LONGOPS to detect long running operations
7. Use hints as appropriate
8. Use the SHARED_CURSOR parameter
9. Use the Rule-based optimizer if I is better than the Cost-based optimizer
10. Avoid unnecessary sorting
11. Monitor index browning (due to deletions; rebuild as necessary)
12. Use compound indexes with care (Do not repeat columns)
13. Monitor query statistics
14. Use different tablespaces for tables and indexes (as a general rule; this is old-school somewhat, but the main point is reduce I/O contention)
15. Use table partitioning (and local indexes) when appropriate (partitioning is an extra cost feature)
16. Use literals in the WHERE clause (use bind variables)
17. Keep statistics up to date

Conclusion

ETL projects today are designed for correct functionality and adequate performance, i.e., to complete within a time window. However, the task of optimizing ETL designs is left to the experience and intuition of the ETL designers. In addition, ETL designs face additional objectives beyond performance.

Read More about BI Testing

Advertisements

Automation Tool Selection Recommendation

March 15, 2012 Leave a comment
  • Overview
  • Information Gathering
  • Tools and Vendors
  • Evaluation Criteria
  • Tools Evaluation
  • Matrix
  • Conclusion
  • Overview

Automated Testing” means automating the manual testing process currently in use. This requires that a formalized “manual testing process” currently exists in the company or organization. Minimally, such a process includes:

–        Detailed test cases, including predictable “expected results”, which have been developed from Business Functional Specifications and Design documentation.

–        A standalone Test Environment, including a Test Database that is restorable to a known constant, such that the test cases are able to be repeated each time there are modifications made to the application.

Information Gathering

Following are sample questions asked to tester who have been using some the testing tools:

How long have you been using this tool and are you basically happy with it?

How many copies/licenses do you have and what hardware and software platforms are you using?

How did you evaluate and decide on this tool and which other tools did you consider before purchasing this tool?

How does the tool perform and are there any bottlenecks?

What is your impression of the vendor (commercial professionalism, on-going level of support, documentation and training)?

Tools and Vendors

  • Robot – Rational Software
  • WinRunner 7 – Mercury
  • QA Run 4.7 – Compuware
  • Visual Test – Rational Software
  • Silk Test – Segue
  • QA Wizard – Seapine Software

Tools Overview

Robot – Rational Software

–        IBM Rational Robot v2003 automates regression, functional and configuration testing for e-commerce, client/server and ERP applications. It’s used to test applications constructed in a wide variety of IDEs and languages, and ships with IBM Rational TestManager. Rational TestManager provides desktop management of all testing activities for all types of testing.

WinRunner 7 – Mercury

–        Mercury WinRunner is a powerful tool for enterprise wide functional and regression testing.

–        WinRunner captures, verifies, and replays user interactions automatically to identify defects and ensure that business processes work flawlessly upon deployment and remain reliable.

–        WinRunner allows you to reduce testing time by automating repetitive tasks and optimize testing efforts by covering diverse environments with a single testing tool.

QA Run 4.7 – Compuware

–        With QA Run, programmers get the automation capabilities they need to quickly and productively create and execute test scripts, verify tests and analyze test results.

–        Uses an object-oriented approach to automate test script generation, which can significantly increase the accuracy of testing in the time you have available.

Visual Test 6.5 – Rational Software

–        Based on the BASIC language and used to simulate user actions on a User Interface.

–        Is a powerful language providing support for pointers, remote procedure calls, working with advanced data types such as linked lists, open-ended hash tables, callback functions, and much more.

–        Is a host of utilities for querying an application to determine how to access it with Visual Test, screen capture/comparison, script executor, and scenario recorder.

Silk Test – Segue

–        Is an automated tool for testing the functionality of enterprise applications in any environment.

–        Designed for ease of use, Silk Test includes a host of productivity-boosting features that let both novice and expert users create functional tests quickly, execute them automatically and analyze results accurately.

–        In addition to validating the full functionality of an application prior to its initial release, users can easily evaluate the impact of new enhancements on existing functionality by simply reusing existing test casts.

QA Wizard – Seapine Software

–        Completely automates the functional regression testing of your applications and Web sites.

–        It’s an intelligent object-based solution that provides data-driven testing support for multiple data sources.

–        Uses scripting language that includes all of the features of a modern structured language, including flow control, subroutines, constants, conditionals, variables, assignment statements, functions, and more.

Evaluation Criteria

Record and Playback         Object Mapping

Web Testing Object              Identity Tool

Environment Support        Extensible Language

Cost                                            Integration

Ease of Use                             Image Testing

Database Tests                     Test/Error Recovery

Data Functions                    Object Tests

Support

3 = Basic  2 = Good  1 = Excellent

Tool Selection Recommendation

Tool evaluation and selection is a project in its own right.

It can take between 2 and 6 weeks. It will need team members, a budget, goals and timescales.

There will also be people issues i.e. “politics”.

Start by looking at your current situation

– Identify your problems

– Explore alternative solutions

– Realistic expectations from tool solutions

– Are you ready for tools?

Make a business case for the tool

–What are your current and future manual testing costs?

–What are initial and future automated testing costs?

–What return will you get on investment and when?

Identify candidate tools

– Identify constraints (economic, environmental, commercial, quality, political)

– Classify tool features into mandatory & desirable

– Evaluate features by asking questions to tool vendors

– Investigate tool experience by asking questions to other tool users Plan and schedule in-house demonstration by vendors

– Make the decision

Choose a test tool that best fits the testing requirements of your organization or company.

An “Automated Testing Handbook” is available from the Software Testing Institute (www.ondaweb.com/sti), which covers all of the major considerations involved in choosing the right test tool for your purposes.

Read More about Automation Tool 

Marketing Automation in B2B – Separating the Wheat from the Chaff

The B2B landscape in its inherent form is a complex jar of beans primarily because the initial connection needs lots of nurturing with the right mix of appropriate communication to ensure the “best weather” for sales interaction. Marketers not only have to measure outcomes right up to revenue but also find the “sweet spot” for marketing and sales to drum up the right notes.

The year 2009 and the first half of 2010 saw a marked shift towards marketing automation worldwide. All of this has helped channelize information and reach out to prospects better, yet it is pertinent to note the today’s internet savvy prospect is also armed with qualifying information about your brand, your products and your competitors as never before. To get inside the mind of the B2B buyer, marketers not only need to understand his intent from his digital body language but also ensure that automated lead generation processes in place scale up in terms of the following pertinent factors at any point of time.

  • Are lead recycling programs in place for not-sales-ready leads?
  • Has social media, inbound marketing and marketing automation been integrated seamlessly?
  • Is your marketing communication supported by buyer-centric collaterals that help the buyer decide in your favor?
  • Has your data been data washed and scrubbed clean?
  • Does your web metrics provide actionable information for user profiling and conversion?
  • Are sales and marketing on the same latitude to proving your prospect the best buying experience?
  • Does your Social media spin influence the markets conversation about your brand effectively?
  • Is your opt-in list getting fresh brew in the form of persuasive communication and supporting newsletter value?
  • Is your marketing funnel measurable and process definitions flexible to innovation?
  • How effective is your conversation model, does it ensure that you are at top of mind when prospects decide to bite the bait?
  • Are you able to capitalize on marketing automation’s great benefit – reporting effectively and use it as a strategic tool?
  • Does your data-centric marketing plans lead the way for greater customer intelligence since value of data will not be a constant ?

The above are just a few important cogs that can make or break your lead generation wheel. As marketers brace themselves to capitalize on marketing automation to enhance pipeline opportunities, trends all point to an explosive growth in marketing automation adoption. It is highly imperative that automation vendors provide more sophisticated reporting, better sales engagement processes and social media integration.

To help marketing efficiently separate the wheat from the chaff, marketing automation should not just serve as a driver of operational efficiency but more importantly enhance continuity of dialogue with prospects throughout the decision making/buying cycle at all relevant touch points.

Ultimately it is all about the harvest – the pipeline and revenue, the executive leadership would not mind how you do it.

Read More about Marketing Automation in B2B