Replacing a Legacy System: Part-1

Legacy applications are like a first love: really hard to let go! While replacing them is an uphill task, there are many reasons why replacing a legacy application is beneficial. Here are a few of the critical, deciding factors.

  • Cost or Cost-Effectiveness

This is a prime factor in replacing legacy systems. A major chunk of money goes into maintaining these applications, and any organization has to spend it; but is it worth maintaining an app built on almost obsolete technology? If I were a CFO, I would always ask, “What ROI does supporting my current app bring versus replacing it with a new one?” When the cost of replacing the legacy application is lower than the cost of maintaining it, replacement is the desirable course of action.

  • Integration Challenges

Legacy systems do not always integrate well with newer systems, and the pain of integrating them via custom-written apps can exceed the cost of replacing the entire system. In the era of IoT (and ubiquitous computing), where all devices communicate with each other and integration is seamless, it makes more sense to adopt new technology and replace the old.

  • Productivity

Productivity is another driver for replacing legacy systems. As the basic definition of business shifts from product-centric to customer-centric, use cases have become more complex. Legacy systems have been doing the heavy lifting (amending and sustaining) at the price of productivity (cost and time). Newer technology and applications ship with configurable business-rule engines, which enhance productivity and make future adaptation smoother.

  • Laid-back Decision Making

Decision making has been a challenge with legacy applications because accessing data and churning out meaningful insights (data analytics) requires a separate decision-making system. Data from the legacy application has to be transformed and refreshed into that system, coupled with a reporting solution for presentation. Business stakeholders have therefore taken “reactive” decisions based on historic trends (or, in terms of security, past incidents). With the advent of bleeding-edge technology and in-memory data processing, stakeholders are in a position to take proactive decisions (and mitigate security risks in a timely manner).

Now that you understand why legacy applications should be replaced, stay tuned for the next blog, where I shall list the important technical challenges we overcome for a smooth replacement.


A go-live Saved!!!

An unfixed issue will always haunt you on go-live day. Today was one of those days: the customer started complaining about serious performance issues post go-live. Eventually, a small configuration change saved the day, and in the end the customer was happy!

Here is what the customer complained about:

  • The application is very slow and there is a lot of clocking.
  • Application users have to wait as long as a minute after pressing a button to fetch data.
  • CPU utilization was constantly high and never came below 90% post deployment.

Our observations:

  • Found more than 1,000 queries consuming high CPU time (average worker time); a plan-cache query for pulling these is sketched after this list.
  • There were more than 500 queries with an average execution time of more than 10 seconds.
  • Max degree of parallelism was set to “0” (the default).
  • Cost threshold for parallelism was set to “5” (the default).
  • The SQL Server box has a single NUMA node with 8 logical processors.
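Here is a minimal sketch of pulling such queries from the plan cache with sys.dm_exec_query_stats; the TOP value is illustrative, and the times the DMV reports are in microseconds:

```sql
-- Top CPU consumers currently in the plan cache (since the last restart / cache flush)
SELECT TOP (50)
    qs.execution_count,
    qs.total_worker_time  / qs.execution_count AS avg_worker_time,
    qs.total_elapsed_time / qs.execution_count AS avg_elapsed_time,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```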

Analysis and Correction Steps

  • Max degree of parallelism: As per the guidelines for max degree of parallelism, a SQL Server with a single NUMA node and 8 logical processors should have a MAXDOP setting of 8 or less. The default setting of zero should always be changed based on the CPU cores available to SQL Server.
  • Cost threshold for parallelism: This setting was also left at its default. As per expert recommendations it should be set between 25 and 50, but one always has to test and find a number that is neither too high nor too low. A low value, say 5 (the default), means that many queries whose cost is greater than 5 will be chosen to execute in parallel; queries that do not need parallel execution are forced to go parallel, and their execution time shoots up. The opposite is true when the value is too high: a candidate query that should execute in parallel runs as a single thread with a high execution time. We recommended the pre-tested value of 30, measured for an equivalent workload. (Both parallelism changes are sketched in the script at the end of this post.)
  • High CPU: CPU utilization was high because almost all the queries forced an implicit conversion of VARCHAR columns to NVARCHAR. This happens when a parameterized query declares the default NVARCHAR(4000) for a string parameter coming from the application while the underlying column in the database is of type VARCHAR, so the query goes through an implicit conversion. SQL Server compiles the execution plan by sniffing the parameters from the input query (parameter sniffing) and reuses that plan whenever the query is executed. The implicit conversion makes the existing indexes unusable, and the query goes for a Clustered Index Scan or a full table scan (for a heap), which further shoots up the execution time. The fix for this problem is to let SQL Server know the data type of the parameters. In our case, we knew there was no NVARCHAR column in the database, so a small change in the JDBC URL solved the problem (sendStringParametersAsUnicode=false).

In the end, it was a day accomplished (with a happy customer) and the go-live was saved 🙂
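For reference, here is a minimal sketch of the two parallelism changes we applied, assuming sysadmin rights; the values match the discussion above, but every workload should be tested before reusing them:

```sql
-- Make the advanced parallelism options visible to sp_configure
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Cap parallelism at the 8 logical processors of the single NUMA node
EXEC sp_configure 'max degree of parallelism', 8;

-- Raise the cost threshold so only genuinely expensive queries go parallel
EXEC sp_configure 'cost threshold for parallelism', 30;
RECONFIGURE;
```

On the application side, the only change was adding sendStringParametersAsUnicode=false as a connection property in the JDBC URL.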

Application Tuning vs Database Tuning

Recently, there was a debate in which application developers and database developers were in a tussle, pointing fingers at each other over a petty performance issue 🙂

Here are a few pointers from the debate and the initial investigation:

  • Application Developer: There is a poorly performing query when it is executed from the application. The database developers need to analyse and fix it.
  • Database Developer: The query, when executed directly against the database, works fine and returns relevant results quickly.
  • Database Administrator: The query pulled from the plan cache is parameterized and goes for a full table scan. There are indexes defined on the predicate columns, but they are not getting picked up!

This is what the developers saw when executing the query from Management Studio:

[Image: DeveloperPlan]

This is the execution plan generated when the query is executed from the application:

[Image: ApplicationPlan]

Clearly, looking at the highlighted operators in the plans, we can see that the query executed by the application goes for a Clustered Index Scan as opposed to a Clustered Index Seek.

Why this difference?

It boils down to the parameter data types!

The application developer passed a parameter of “String” type in the filter predicate, whose underlying column is of type VARCHAR. SQL Server creates a parameter of type NVARCHAR by default for any “String” value. Hence, to match the data types on both sides of the equality operator, the VARCHAR column is implicitly converted to NVARCHAR. If we hover over the Clustered Index Scan operator, we can see CONVERT_IMPLICIT being used to perform the conversion (see highlighted below).

[Image: ApplicationScan]

On the other hand, when the same query is executed from Management Studio, it uses an Index Seek, as no implicit conversion is required.

[Image: DeveloperSeek]
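Here is a minimal repro sketch of the behaviour; the table and values are hypothetical, and the scan on the first query assumes the column uses a SQL collation such as SQL_Latin1_General_CP1_CI_AS:

```sql
-- Hypothetical table: a VARCHAR key column with a clustered index on it
CREATE TABLE dbo.Customers
(
    CustomerCode VARCHAR(20)  NOT NULL PRIMARY KEY,  -- clustered index
    CustomerName VARCHAR(200) NOT NULL
);

-- NVARCHAR (Unicode) literal: the VARCHAR column goes through CONVERT_IMPLICIT,
-- and the plan typically shows a Clustered Index Scan
SELECT CustomerName FROM dbo.Customers WHERE CustomerCode = N'C0042';

-- VARCHAR literal: the data types match, and the plan shows a Clustered Index Seek
SELECT CustomerName FROM dbo.Customers WHERE CustomerCode = 'C0042';
```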

Possible application fix: Application developers should describe the types of the parameters passed in the query. In the .NET APIs, parameters can be described using the SqlParameter class, which holds not only the parameter name and value, but also the parameter data type, length, precision, and scale.
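From the database's point of view, the difference shows up in the sp_executesql batch that the client driver sends. A sketch using the hypothetical Customers table above:

```sql
-- Parameter declared as NVARCHAR (the default for strings): implicit conversion, index scan
EXEC sp_executesql
    N'SELECT CustomerName FROM dbo.Customers WHERE CustomerCode = @code',
    N'@code NVARCHAR(4000)',
    @code = N'C0042';

-- Parameter described with the real column type (what a well-typed SqlParameter produces): index seek
EXEC sp_executesql
    N'SELECT CustomerName FROM dbo.Customers WHERE CustomerCode = @code',
    N'@code VARCHAR(20)',
    @code = 'C0042';
```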

 

References:

How data access code impacts database performance
SqlParameter Class

SQL Server Security Basics | The Principle of Least Privilege

We need to walk a fine line when granting privileges to a user to perform a certain task. The Principle of Least Privilege says that a user should be granted only those privileges required to perform their task: nothing more, nothing less. This principle is also known as the principle of minimal privilege or the principle of least authority.

We should always follow the principle of least privilege when granting permissions to database users. A database user whose task is to extract reports should be granted only read permissions on the relevant schema. Imagine if the same user had write permissions to the database: the database could be compromised; for example, sales data could be fudged to look great before management if a user has extra privileges.
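A minimal sketch of what this looks like in SQL Server; the login, user, schema, and password are hypothetical placeholders:

```sql
-- Create a login and a database user for the reporting task
CREATE LOGIN reporting_login WITH PASSWORD = '<StrongPasswordHere>';
CREATE USER reporting_user FOR LOGIN reporting_login;

-- Grant read access only on the schema the reports need, nothing more
GRANT SELECT ON SCHEMA::Sales TO reporting_user;

-- Deliberately do NOT grant INSERT/UPDATE/DELETE, or membership in broad roles such as db_owner
```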

There is no direct way or charter that lists the set of permissions required for each functional task; it varies from organization to organization. To come up with a security model, start by assigning the lowest level of privileges that seems appropriate for the intended task, and then test. As a quick tip, try removing one privilege at a time and check the impact on the task under consideration. This way you will be able to formulate a security model.

I came across an incident recently where a team received a database backup from a customer (in Oracle, as it happens, not SQL Server) to fix an issue, and the user account used to restore the backup actually overwrote password profiles that had been set up internally. This was discovered very late, when some of the application users reported their passwords expiring one week after creation.

Security is very important for an organization to sustain in the long run, and there is no alternative but to conscientiously follow the principle of least privilege.

 


SQL Server Security Basics | What is Authorization?

Think of this:

  • Is a stranger authorized to enter your house without your permission?
  • Do you authorize your friend to take your car whenever he wants to go for a drive?
  • Is a co-worker authorized to access your confidential information stored with the Human Resources department?

You might be getting the hang of it… Authorization is all about “What can a person (or, in the digital world, an identity) do?”; that is, do they have the so-called “access rights/privileges” to the desired “resources”?

That being said, authorization takes the form of access policies that an organization sets forth for the resources being used. These access policies are created and/or controlled by an authority (usually a senior employee or department head). They are formulated based on the “principle of least privilege”, which says that a user/identity should have only the minimum set of privileges needed to get their work done.

In SQL Server, authorization is enforced with permissions, and we have the freedom to club common permissions together into roles. Permissions are hierarchical in nature and exist at both the database and the server level.
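A small sketch of database-level permissions grouped into a role; the role, schema, and user names are hypothetical:

```sql
-- Group common permissions into a database role
CREATE ROLE report_readers;

-- Grant the role only the permissions the reporting task needs
GRANT SELECT ON SCHEMA::Sales TO report_readers;
GRANT VIEW DEFINITION ON SCHEMA::Sales TO report_readers;

-- Authorize an individual user by adding them to the role
ALTER ROLE report_readers ADD MEMBER reporting_user;
```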

I will talk more about authorization, the permission hierarchy, and the principle of least privilege in upcoming blogs. So stay tuned!

SQL Server Security Basics | What is Authentication?

By definition, authentication is the process of verifying the identity of a user or process. If a user wants to talk to the database, SQL Server asks “Who are you?” and authenticates them. There are three authentication mechanisms available:

  • Windows Authentication
  • SQL Server Authentication
  • Azure Active Directory

Windows Authentication

  • This is the default authentication mode and is more secure than SQL Server Authentication.
  • Microsoft BOL recommends using Windows Authentication over SQL Server Authentication.
  • This mode is available both for SQL Server running on-premises and on an Azure Virtual Machine.
  • It uses the Kerberos security protocol.
  • A connection made under this mode is also called a “trusted connection”, as SQL Server trusts the Windows credentials.
  • It brings additional password policies, such as strong password validation, account lockout, and password expiration.
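A minimal sketch of granting access to a Windows account; the domain and account name are hypothetical:

```sql
-- Create a login for a Windows (domain) account; Windows, not SQL Server, verifies the identity
CREATE LOGIN [CONTOSO\report_user] FROM WINDOWS;
```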

SQL Server Authentication

  • Logins here are created, managed, and validated by SQL Server itself.
  • Unlike Windows Authentication, the user must provide credentials every time they connect to SQL Server.
  • A few (optional) password policies are also available.
  • This mode can be used where applications must support a mix of operating systems and users cannot be validated against a Windows domain.
  • It can be useful with web applications where users can create their own identities.
  • It does not use the Kerberos security protocol, and there is also a risk that applications which connect automatically to SQL Server may save the password in a file in clear text.
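A minimal sketch of a SQL Server-authenticated login with the optional password policies enabled; the login name and password are placeholders:

```sql
-- SQL Server creates, stores, and validates this login and its password
CREATE LOGIN app_login
    WITH PASSWORD = '<StrongPasswordHere>',
         CHECK_POLICY = ON,       -- enforce the Windows password policy rules
         CHECK_EXPIRATION = ON;   -- enforce password expiration
```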

Azure Active Directory

  • This mechanism authenticates an identity against Azure Active Directory (Azure AD).
  • It supports token-based authentication, ADFS (domain federation), and built-in (cloud-only) authentication without domain synchronization.
  • It also supports password rotation in a single place.
  • It allows identities to be managed centrally (Central ID), which helps simplify user and permission management.
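A minimal sketch of authorizing an Azure AD identity on Azure SQL Database; the account name is hypothetical, and the statements are run in the target database by an Azure AD admin:

```sql
-- Contained database user mapped to an Azure AD account; no separate SQL login is needed
CREATE USER [report.user@contoso.com] FROM EXTERNAL PROVIDER;
ALTER ROLE db_datareader ADD MEMBER [report.user@contoso.com];
```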

I am going to write a series of blogs on security basics; this one is the first in the series. Stay tuned.

References:

Choosing Authentication Mode

Azure AD Authentication