SAP ASE HADR Overview – Part3

In this multi-part post, explaining the basics behind how SAP Replication Server works when replicating from a SAP ASE database to a SAP ASE database as part of an HADR (ASE always-on) setup for a SAP Business Suite system.

The post is based on SAP ASE (Adaptive Server Enterprise) 16.0 HADR with SAP Replication Server (SRS) 16.0.

In Part 1 we started with:

  • What is SRS.
  • The basic premise of HADR with SRS.
  • What a transaction is.

In Part 2 we went on to discuss:

  • What is the ASE transaction log.
  • Which databases are replicated.
  • How do transactions move to the SRS.

In this part we discuss the role of the active SRS and how the internal processing moves the transactions into the secondary ASE database.

What is the Active SRS?

In a standard HADR setup with primary and secondary databases, there are two instances of SRS.

  • The inactive SRS is running on the primary database node.
  • The active SRS is running on the secondary database node.

The active SRS receives the transactions (commands) from the Replication Agents in the primary databases.

To those with DB2 or Oracle experience, this replication hop seems strange at first. On closer inspection it achieves the same desired result, the transactional data is successfully persisted on a server separate to the primary database..

The inactive SRS is unused until a failover occurs.
During a failover the inactive SRS, on the old primary server, can switch replication paths to become the active SRS. Therefore the inactive SRS is the reverse path of replication.

What are the Key Parts of SRS?

In my view, there are 5 key parts to the SRS 16.0 architecture.
At each of these stages the data is persisted.

  1. The primary database.
    Sends transactions from the transaction logs, via the Replication Agent threads to the active SRS.
  2. The SRS Simple Persistent Queue (SPQ).
    This is a simple set of disk files for persisting the unordered transactions received on the SRS.
    In synchronous replication mode, once a replicated transaction is persisted in the SPQ, it is classified as “received” on the secondary which allows the transaction to commit on the primary database.
    A backlog on the SPQ could mean the underlying disk is not fast enough for the replication workload, or that the server hosting the actives SRS is suffering CPU/IO saturation.
    If you have anti-virus installed, you should treat the SPQ disk just like you treat other database data files (i.e. exclude them from the A/V scanning).
  3. The SRS Stable Inbound Queue (IBQ).
    An ordered table of the replicated transaction commands that represent the open or final transactions.
    In the case of transactions that are rolled back, they are removed at the IBQ once the rollback command is seen (needs to come via the SPQ).
    There is one IBQ for each primary database and it only ever holds transactions from a database and never from another Rep Server.
    The SRS internals process the transactions on the IBQ that have a corresponding commit record (needs to come via the SPQ).
    These committed transactions are grouped/compacted, ordered and translated into the correct target language for the target database platform and moved to the respective outbound queue for the target databases.
    A backlog on the IBQ could mean the SRS internals may not be keeping up with the replication workload.
    It is important to re-state that transactions on the IBQ could be open and waiting for a rollback or commit, which means the SPQ needs space for the transactions that contain those commands to make it through to the IBQ.
  4. The SRS Stable Outbound Queue (OBQ).
    Committed transactions are moved from the IBQ onto the the OBQ.
    The OBQ actually shares a portion of the same partition space as the IBQ, so moving between the IBQ and OBQ is very quick.
    There is one OBQ for each target database and one if the target is another SRS (in scenarios with a third DR node).
  5. The Target (a.k.a Standby or Secondary or Companion) Databases.
    The SRS has a set of distribution threads (DIST) that apply the transactions from the OBQs to the respective target databases via the DSI (Data Server Interface).
    NOTE: In my diagrams I’ve positioned the DSI as slightly separate, but it is actually a module/component of the DSI.

    For scenarios with a DR node also, the target is the DR Rep Server.
    In the target databases you will the <SID>_maint user is used to apply the transactions.
    A backlog on the OBQ could indicate a problem with the performance of the target database.

Replication Synchronisation Modes

With ASE HADR there are 3 replication modes available. The different modes affect how transactions are treated.

In asynchronous replication mode (a.k.a “Warm Standby” or “DR”), the Replication Agent threads operate in a lazy way. Scanning for and sending transactions to the active SRS in batches when they can.

There is a replication delay in asynchronous mode, but it does mean that the responsiveness of the primary database may seem better to the client applications (SAP Business Suite) because the commit to the primary database does not wait for the Replication Agent to send the transaction to the SRS.
Because of the inherent delay, there is a high possibility of data-loss in a failover scenario.
Even in asynchronous mode, the transaction log of the primary database cannot be freed until transactions are actually committed to secondary (the STP cannot be moved until transactions are committed on secondary).

When HADR is configured in synchronous replication mode (a.k.a “Hot Standby” or “HA”), each transaction is immediately sent by the replication agent to the SRS for persisting on disk at the SPQ.
Once safely in the SPQ, the primary database is allowed to commit the transaction.

This means synchronous replication mode has a direct impact on the responsiveness of the primary database transactions, because the commit on primary will be delayed by the Replication Agent + network transfer + memory + I/O latency for persisting to the SPQ of the SRS.

Lastly, near-synchronous mode works the same way as synchronous, but it is designed for slower disks hosting the SPQ on the SRS. This means that the SRS acknowledgement is sent back to the Replication Agent as soon as transaction is received in memory, but before it is persisted to the SPQ files on disk.

Compared to synchronous mode, near-sync has a slightly higher possibility of data-loss, in exchange for a slight reduction in latency.

Open Transaction Handling

As mentioned previously, transactions are replicated from the primary almost as soon as they are started (as soon as the transaction log records the start of the transaction). This means “open” transactions are replicated as soon as they are started and flow all the way to the IBQ.

(See Part 1 for a description of “Open Transactions”)

If transactions were only replicated once they were committed on primary, there would be a large delay before that transaction was safely applied to the secondary database.
When enabled, the “early dispatch” feature (“parallel_dist”) even allows large open transactions (using specific SRS threads) to start applying to the secondary database early, once the commit record is seen at the SPQ.

What are the Implications of Replication Delay?

There are two main implications of delay (latency) in replication:

  1. Potential Data Loss.
    Any delay in the replication from primary to SRS introduces potential data-loss.
    This is because the aim of replication is to get the transactional data off the primary databases as soon as possible and safely onto the active SRS node (server) in another location. Any delay between the active SRS and the secondary database could mean data-loss but only if the SRS software itself is lost or corrupted.
    The ultimate goal is to get from the primary database to the active SRS as quickly as possible, with the second goal being to get from the primary database to the secondary database as quickly as possible.
    This opens up possibilities regarding architecture design and sizing of the secondary databases, which I will cover in another post.
  2. Response Time.
    Any delay in synchronous replication mode can also delay the commit of the transactions on the primary databases. In a Netweaver ABAP stack, this delay would be seen in the SAP Business Suite as a longer database response time in the Netweaver UPDATE work processes, therefore the delay is unlikely to be seen by the users themselves. In a Java stack, there is no such UPDATE work process, so the Java stack is likely to be more sensitive to longer response times.

In the next part (part 4 is here), we will step through the replication of a single transaction and discuss the impact of different scenarios along the way.

Source pagehttps://www.it-implementor.co.uk/2021/02/sap-ase-hadr-overview-part3.html

SAP ASE HADR Overview – Part2

In this multi-part post, explaining the basics behind how SAP Replication Server works when replicating from a SAP ASE database to a SAP ASE database as part of an HADR (ASE always-on) setup for a SAP Business Suite system.

The post is based on SAP ASE (Adaptive Server Enterprise) 16.0 HADR with SAP Replication Server (SRS) 16.0.

A recap of what we discussed in Part 1:

  • What is SRS.
  • The basic premise of HADR with SRS.
  • What a transaction is.

In this part we discuss the ASE transaction log and how important it is for the working of the database then we discuss how transactions are replicated to the active SRS and touch on the internals of the SRS.

What is the ASE Transaction Log?

The ASE transaction log is the most active part of an ASE database.
When we talk about database ACID principles, the transaction log is what helps deliver those things. It is simply a log of transactions applied to a database.
The transaction log is where changes to the database are first made, it is the register, used to ensure that the database can be recovered to a consistent point-in-time.
Some of the seven ASE databases hosted in an ASE 16.0 instance do not need this point-in-time recovery capability, so they have a special setting that simply truncates the records of the transaction log frequently. The setting/option is called “trunc log on chkpt” and should *never* be applied to the SID database (the BusSite database) in production.
The SID database holds our vital business data, and needs to be recoverable right up to the last transaction that had a final state (committed or rolled-back) applied on that database.

On a regular interval, data changes contained in the transaction log that have a final state, are applied to the database data devices (the database main persistence).
This process is known as a checkpoint.
In those ASE databases with option “trunc log on chkpt” turned on, a checkpoint causes the used transaction log entries to be simply truncated (cleared), with no dump required.
In ASE databases with “trunc log on chkpt” turned off (like the SID database), after a checkpoint has occurred, the used transaction log segments are marked as freeable, and these freeable segments can become available for re-use only once a transaction log backup (dump) is performed to a backup disk or device.

This is a cycle that allows the database to continue to be used for making changes to the database data while also protecting the register of transactions in case of a database crash or even complete corruption.

If the transaction log ever becomes full, then the database is unable to allow new changes and no new transactions can be started until space is made available in the transaction log. It is therefore very important for regular database transaction logs dumps to be performed.
Not only to free space in the transaction log, but also to protect the database in the event that the transaction log is lost or corrupted. For added protection, these transaction dump file(s) can also be moved off to a separate location, away from the database server itself. This is a function of almost every backup tool and as you will find a requirement of almost every database that abides by the ACID principles.

Which Databases are Replicated?

Each ASE instance hosts multiple ASE databases.
In a Business Suite system, the ASE 16.0 instance hosts 7 databases (plus 2 temporary databases and a model database):

  • Master
  • Sybsystem
  • Sybmgmtdb
  • Sybsecurity
  • Sybsystemprocs
  • Saptools
  • SID

In a HADR system, we usually only replicate the Master database and the SID database. This is because the master database holds vital information about the user logins, roles and privileges, which are needed to be the same in the target system. The SID database is the database for the SAP Business Suite system, it holds our business data, which we need to protect.

How do Transactions Move to the SRS?

Inside the replicated primary ASE databases, there exists a Replication Agent process that runs in its own process thread.
The Replication Agent scans the transaction log of the database it serves and sends the transactions across to the active SAP Replication Server (SRS).
In the active SRS a matching receiver thread (Rep Agent User component thread) receives the transactions and pushes them into the Simple Persistent Queue (SPQ) on the disk of the active SRS server.

Subsequent components inside the active SRS process the transactions from the SPQ and eventually push them into the secondary database.

Source pagehttps://www.it-implementor.co.uk/2021/01/sap-ase-hadr-overview-part2.html

Part 3 is now available here.

SAP ASE HADR Overview – Part1

In this multi-part post, going to explain the basics behind how SAP Replication Server works when replicating from a SAP ASE database to a SAP ASE database as part of an HADR (ASE always-on) setup for a SAP Business Suite system.

The post is based on SAP ASE (Adaptive Server Enterprise) 16.0 HADR with SAP Replication Server (SRS) 16.0.

What is SAP Replication Server (SRS)?

Replication Server was originally a Sybase product, ingested by SAP when it bought Sybase.
SAP Replication Server (SRS) is not database specific, it can support a number of other source and target database systems such as SAP ASE, SAP HANA, Oracle and SQLAnywhere. Because of this heterogeneous database support, SRS is quite a complex product, offering a multitude of replication scenarios.
In fact, the SRS product is the underpinning to a number of other SAP products such as SAP Landscape Transformation (SLT) and the Near-Zero Downtime (NZDT) option for database migrations.

SRS can be used to provide HA and/or DR for databases, but it can also be used to produce active-active setups with multi-regional replicas of databases for improved local access times.
This is somewhat of an exceptional case and most definitely rare with SAP Business Suite applications.

In this post, we will be using the simple example of just a primary and a secondary (companion) database.
This is known as either SAP ASE HADR or ASE always-on and is the most common use case. In SRS language our secondary database is referred to as the “companion” database.

With SRS it is also possible to have a three tiered architecture, primary, secondary and tertiary. In this three tier setup the secondary is known as the companion and the disaster recovery (DR) database is known as the “DR node” or tertiary database. You would usually use a three tier HADR architecture setup if you want HA in a primary datacentre or cloud region and also a separate DR in a secondary datacentre or cloud region.

There are multiple replication options: synchronous, near-synchronous or asynchronous, depending on your latency between source and target databases, your required RPO and also depending on your required use of HADR.

SRS is the recommended option for SAP ASE database replication.

What is the Basic Premise of HADR with SRS?

For SAP systems with SAP ASE HADR (always-on), the SRS provides replication of “transactions” from the source database(s) to secondary (companion) and/or DR database(s).

In a SAP landscape, the SAP system is configured to fail-over its connection to the companion in the event of a database failover. No cluster is needed for the database network connectivity, because the ASE database driver (dbsl) is “HA aware”.

Inside the primary database a process called the Replication Agent is responsible for sending the transactions to the Replication Server on the companion database and it is configured in Stream Replication mode. This is the only supported mode in HADR.
NOTE: Stream Replication is also known as “ci” (Component Interface) throughout the SRS administration manuals. There are many occasions where you will need to know this information.

In “ci” mode, the SRS proprietary language Log Transfer Language (LTL) is not used.
With “ci” mode, there are three possible synchronisation modes: synchronous, near-synchronous or asynchronous.

With SRS, the primary and secondary databases are NOT the same database (regarding layout, size, blocks), unlike HANA System Replication.
They are their own databases requiring all the usual care and attention that would be applied to the primary. Such as frequent transaction log backups, health checks etc.

With SRS enabled, database transactions that are started on the primary database are replicated to the secondary while still in the open state.

What is a Transaction?

A transaction is an ATOMIC unit of work with a beginning and an end, with work performed in between.


Each transaction can have one of two final states. It is either committed (saved) to the database, or it can be rolled-back (undone).

A transaction that is not yet in a final state, is called an “open” transaction. Any transaction in the “open” state, is still in progress and occupies space in the database transaction (tran) log. Used space in the tran log cannot be used by other transactions.
In a HADR system, the oldest open transaction executing on the primary database, is usually what is responsible for the position of the Secondary Truncation Point (STP) in the primary database.
The STP is a marker point placed into the primary database transaction log by the Replication Agent and is used to determine the current commit point in the companion database (i.e. it shows the latest transaction that is not yet committed on the companion database).

That’s it for part 1.
In part 2 we will go into the internals of SRS and how transactions are replicated through it.


Source page : https://www.it-implementor.co.uk/2021/01/sap-ase-hadr-overview-part1.html

Sybase interview questions

      Ã˜ What is sybase?

Ø Explain ASE architecture?

Ø What are the system roles ?

Ø What are the default databases ?

Ø Threads inside Sybase?

Ø what is time slice error and how to resolve timeslice -201 ?

Ø what is fill factor?

Ø Types of locks in Sybase?

Ø Row locking mechanism in ASE?

Ø what is the use of data cache,procedure cache,statement cache?

Ø Truncate only and no log difference?

Ø Dirty buffers?

Ø Reorg types ?

Ø Types of update statistics and how to check the last stats of the table?

Ø How to clear tempdb space ?

Ø How to separate tempdb from master ?

Ø How to check long running queries?

Ø What is the use of sybsystemdb?

Ø Clustered index and non clustered index ?

Ø Table partitions and its use ?

Ø What is the threshold actions ?

Ø what is the use of sysmon report?

Ø what is optdiag report ?

Ø How to run upgrade?

Ø What is zombie/ghost process?

Ø What is RAID in sybase ?

Ø Different types of replication and how to configure and maintain high availability?


Refer the below links for more :

https://testmydailyworks.wordpress.com/2012/11/28/sybase-all-in-one/

https://dbasdiary.com/home/important-15-sybase-ase-dba-interview-questions-and-answers/

https://www.bestinterviewquestion.com/sybase-interview-questions

https://www.wisdomjobs.com/e-university/sybase-interview-questions.html

http://dba.fyicenter.com/Interview-Questions/SYBASE/

https://www.globalguideline.com/interview_questions/Questions.php?sc=sybase_database

https://www.careerride.com/sybase-interview-questions.aspx


How to check the long running transactions in sybase

 Step 1:- Log in the dataserever using isql

isql -Usa -S<SID> -X -w6000

Step 2:- Check the long running trans as below

If there is any row in syslogshold, see the start time column and current time,

If there is big diff, it means you have log runningg trans in dataserver and you need to esclate to APP team.

for particular db holds

select * from syslogshold where dbid=db_id("<DB_NAME>")

go

select db_id("<DB_NAME>")

go

EXAMPLE :

1> select * from syslogshold

2> go

dbid   reserved    spid   page        xactid         masterxactid   starttime

        name                                                                xloid

------ ----------- ------ ----------- -------------- -------------- --------------------------

        ------------------------------------------------------------------- -----------

     4           0    627    43859290 0x029d3d5a0005 0x000000000000        Jun 12 2011  9:34AM

        $REORG REBUILD dmASSET_HIST ID=621764747                                   1254

     2           0    606      332908 0x0005146c000a 0x000000000000        Jun 12 2011  3:02AM

        $ins                                                                        628

(2 rows affected)

1> select getdate()

2> go

 --------------------------

        Jun 12 2011  4:00PM

(1 row affected)

Step 3:- Now we need to find the detail of blocking spid as below

1> select * from sysprocesses where spid=606

2> go

spid   kpid        enginenum   status       suid        hostname   program_name     hostprocess cmd              cpu

        physical_io memusage    blocked dbid   uid         gid

        tran_name                                                        time_blocked network_pktsz fid

        execlass                       priority   affinity                       id          stmtnum     linenum     origsuid

        block_xloid clientname                     clienthostname                 clientapplname                 sys_id

        ses_id      loggedindatetime           ipaddr

 ------ ----------- ----------- ------------ ----------- ---------- ---------------- ----------- ---------------- -----------

        ----------- ----------- ------- ------ ----------- -----------

        ---------------------------------------------------------------- ------------ ------------- ------

        ------------------------------ ---------- ------------------------------ ----------- ----------- ----------- -----------

        ----------- ------------------------------ ------------------------------ ------------------------------ ------

        ----------- -------------------------- ---------------

   606   283443444           4 lock sleep           546 nyggmgrotc RESETREP         23822       INSERT                     3

                  0          22     627      4         405       16390

        $user_transaction                                                       15143           512      0

        BS4                            HIGH       ANYENGINE                        637764804           5         439        NULL

                  0                                                                                                   0

                  0        Jun 12 2011 11:24AM 10.152.115.100

(1 row affected)

1> sp_who "606"

2> go

fid    spid   status       loginame                       origname                       hostname   blk_spid

        dbname                         cmd              block_xloid

------ ------ ------------ ------------------------------ ------------------------------ ---------- --------

        ------------------------------ ---------------- -----------

     0    606 lock sleep   lcprusr                        lcprusr                        nyggmgrotc 627

        tempdb                       INSERT                     0

(1 row affected)

(return status = 0)

1> dbcc traceon(3604)

2> go

NOTE: 3604 is used to send output back to the client instead of writing to error log

 DBCC execution completed. If DBCC printed error messages, contact a user with System Administrator (SA) role.

1> dbcc sqltext(606)

2> go

SQL Text: 320110714CONV_AVG

AUTO      CARRY

DBCC execution completed. If DBCC printed error messages, contact a user with System Administrator (SA) role.

1> sp_showplan 606,null,null,null

2> go

 QUERY PLAN FOR STATEMENT 5 (at line 439).

   STEP 1

  The type of query is INSERT.

  The update mode is direct.

  TO TABLE dmASSET_HIST Using I/O Size 2 Kbytes for data pages.

(return status = 0)

1> sp_lock 606

2> go

 The class column will display the cursor name for locks associated with a cursor for the current user and the cursor id for other

 users.

fid    spid   loid        locktype                     table_id    page        row    dbname

        class                          context

------ ------ ----------- ---------------------------- ----------- ----------- ------ ---------------

        ------------------------------ ----------------------------

     0    606        1686 Ex_intent                      589764633           0      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_row                         589764633      199169      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_intent                      635720842           0      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_row                         635720842    22607248      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_intent                      667720956           0      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_row                         667720956    46143296      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_intent                     1076770418           0      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_row-blk                    1076770418    13586806      2 tempdb

        Non Cursor Lock

     0    606        1686 Ex_row                        1076770418    13807873      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_intent                     1108770532           0      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_row                        1108770532    18465042      2 tempdb

        Non Cursor Lock

     0    606        1686 Ex_row                        1108770532    33477240      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_intent-blk                 1732772755           0      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_row                        1732772755    13475329      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_intent                     1764772869           0      0 tempdb

        Non Cursor Lock

     0    606        1686 Ex_row                        1764772869    19673089      0 tempdb

        Non Cursor Lock

(16 rows affected)

(return status = 0)

Step 4:- With above detail you need to send the mail to APP Team.

Take the approval from them and kill the open transaction .

 

Causes of Long-Running Transactions ?

Some of the causes for a long-running transaction include:

An incorrectly written update, insert, or delete statement that runs for many hours.

Commands that create Cartesian products or include user input are common mistakes in coding.

An application error that starts a transaction but never completes it.

Useful websites

https://www.yumpu.com/en/document/read/4306765/sybase-ase-15-best-practices-query-processing-optimization https://www.yumpu.com/en/document/...