ЕСОЗ - публічна документація

RC_Offline comparison of divorce acts with Persons (DRACS 2.0)

Purpose

This process fetches divorce act data from the mimir service and compares it with persons data.

Key points

  1. This process uses a cron parameter to configure its start time.

  2. This process runs in the mpi scheduler pod.

  3. Each divorce act is compared by husband data and by wife data separately.

Configuration

| Value | Description | Example |
|---|---|---|
| PERSONS_DRACS_DIVORCE_ACTS_COMPARE_SCHEDULE | Cron parameter, represents start time of offline comparison of divorce acts with persons process | */10 * * * * |
| PERSONS_DRACS_DIVORCE_ACTS_COMPARE_BATCH_SIZE | Number of divorce acts that will be processed in one cycle of the process | 100 |
| PERSON_DRACS_DIVORCE_ACT_PARTIAL_MATCH_SCORE | Upper limit of comparison score for black zone | 0.7 |
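A minimal sketch of how these three values might be read, assuming they are exposed to the job as environment variables (the source does not say how they are delivered). The `CompareConfig` class and `load_config` helper are hypothetical names, and the defaults simply reuse the example values above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CompareConfig:
    schedule: str               # cron expression for the job start time
    batch_size: int             # divorce acts processed per cycle
    partial_match_score: float  # upper limit of the black zone


def load_config(env=None) -> CompareConfig:
    """Read job settings from a mapping (os.environ by default).

    Assumption: settings arrive as strings, as environment variables would.
    """
    env = os.environ if env is None else env
    return CompareConfig(
        schedule=env.get("PERSONS_DRACS_DIVORCE_ACTS_COMPARE_SCHEDULE", "*/10 * * * *"),
        batch_size=int(env.get("PERSONS_DRACS_DIVORCE_ACTS_COMPARE_BATCH_SIZE", "100")),
        partial_match_score=float(env.get("PERSON_DRACS_DIVORCE_ACT_PARTIAL_MATCH_SCORE", "0.7")),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise in isolation.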

Service logic

Offline comparison job starts according to PERSONS_DRACS_DIVORCE_ACTS_COMPARE_SCHEDULE cron parameter.

Step 1. Divorce acts list for comparison

  1. Get the list of divorce acts that are ready to be compared with persons: perform an RPC call to the mimir service (dracs_divorce_acts table) with the following parameters:

    1. ar_op_name = ‘1’ or ‘4’

    2. persons_compare_status = ‘READY’

  2. Sort obtained list in ascending order by dracs_divorce_acts.updated_at field

  3. Limit obtained list with PERSONS_DRACS_DIVORCE_ACTS_COMPARE_BATCH_SIZE value

  4. In case no divorce acts were fetched - end process.
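The selection rules of Step 1 can be illustrated with an in-memory sketch. In the real process the filtering is done by the mimir service through an RPC call. Here `select_batch` is a hypothetical helper and the acts are represented as plain dictionaries with made-up values.

```python
from datetime import datetime

READY = "READY"
ALLOWED_OP_NAMES = {"1", "4"}


def select_batch(acts, batch_size):
    """Return up to batch_size READY acts, oldest updated_at first."""
    ready = [
        act for act in acts
        if act["ar_op_name"] in ALLOWED_OP_NAMES
        and act["persons_compare_status"] == READY
    ]
    ready.sort(key=lambda act: act["updated_at"])
    return ready[:batch_size]


# Illustrative data only
acts = [
    {"id": "a", "ar_op_name": "1", "persons_compare_status": "READY", "updated_at": datetime(2024, 1, 3)},
    {"id": "b", "ar_op_name": "2", "persons_compare_status": "READY", "updated_at": datetime(2024, 1, 1)},
    {"id": "c", "ar_op_name": "4", "persons_compare_status": "READY", "updated_at": datetime(2024, 1, 2)},
    {"id": "d", "ar_op_name": "1", "persons_compare_status": "PROCESSED", "updated_at": datetime(2024, 1, 1)},
]
batch = select_batch(acts, batch_size=100)
```

An empty result corresponds to the "end process" branch.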

Step 2. Data preparation and potential candidates select

  1. Get divorce act from obtained list, update its status in dracs_divorce_acts table:

    1. set persons_compare_status = ‘IN_PROCESS’

  2. Get husband data for comparison process if mn_old_surname <> mn_surname and these fields are not empty

  3. Get wife data for comparison process if wmn_old_surname <> wmn_surname and these fields are not empty

  4. Prepare divorce act data for comparison process

    1. perform regexp for each of the fields:

      1. for mn_name, wmn_name, mn_old_surname, wmn_old_surname, mn_patronymic and wmn_patronymic change:

        1. [ --'] to ''

        2. 'є' to 'е'

        3. 'и' to 'і'

      2. for mn_series_numb and wmn_series_numb change:

        1. [ /%#№ _-] to ''

    2. for mn_date_birth and wmn_date_birth change mask from dd.mm.yyyy to yyyy-mm-dd

    3. for mn_numident and wmn_numident normalize value - check that it contains exactly 10 characters; otherwise set it to null

  5. Get active persons (is_active=true, status=active) as candidates for comparison with husband or wife data from divorce act from MPI db using following predicate blocks:

    1. tax_id

    2. documents.number

    3. birth_date + last_name

    4. settlement_id + last_name

  6. In case no candidates were found for husband and/or wife, update divorce acts persons compare status:

    1. set persons_compare_status = ‘PROCESSED’

  7. Go to next divorce act in obtained list (return to p.1 of Step 2)
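The data preparation rules of Step 2 can be sketched as a handful of small functions. All function names are hypothetical. The name pattern assumes that `[ --']` is meant to strip spaces, hyphens and apostrophes, which is an interpretation rather than something the source states.

```python
import re
from datetime import datetime

# Assumption: the name rule removes spaces, hyphens and apostrophes.
NAME_STRIP = re.compile(r"[ \-']")
# Document number rule: space, slash, percent, hash, numero sign, underscore, hyphen.
DOC_STRIP = re.compile(r"[ /%#\u2116_\-]")
# U+0454 -> U+0435 and U+0438 -> U+0456, as listed in the rules above.
NAME_TRANSLATION = str.maketrans({"\u0454": "\u0435", "\u0438": "\u0456"})


def normalize_name(value):
    if value is None:
        return None
    return NAME_STRIP.sub("", value).translate(NAME_TRANSLATION)


def normalize_doc_number(value):
    return None if value is None else DOC_STRIP.sub("", value)


def normalize_birth_date(value):
    """Convert dd.mm.yyyy to yyyy-mm-dd."""
    if value is None:
        return None
    return datetime.strptime(value, "%d.%m.%Y").strftime("%Y-%m-%d")


def normalize_tax_id(value):
    """Keep the value only if it has exactly 10 characters."""
    return value if value is not None and len(value) == 10 else None


def should_compare(old_surname, new_surname):
    """A spouse is compared only if both surnames are present and differ."""
    return bool(old_surname) and bool(new_surname) and old_surname != new_surname
```

The same functions apply to the husband (mn_*) and wife (wmn_*) fields.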

Step 3. Divorce act comparison process

  1. Get person from obtained candidates list

  2. Prepare persons data for comparison process according to Deduplication process NEW | Data cleaning and preparation

  3. Define person’s document for Deduplication process:

    1. If passport document type is present in the act (DocType=1), then choose person’s PASSPORT or NATIONAL_ID depending on the format of the passport number in the act (determined by regex).

    2. If other document type is present in the act, then submit a list of documents to deduplication process: BIRTH_CERTIFICATE, COMPLEMENTARY_PROTECTION_CERTIFICATE, PERMANENT_RESIDENCE_PERMIT, REFUGEE_CERTIFICATE, TEMPORARY_CERTIFICATE, TEMPORARY_PASSPORT

  4. Compare divorce act data with person data using logistic regression method, as implemented in Deduplication process:

    1. For each variable field use separate calculation process based on the table below.

    2. Calculate final comparison score between divorce act and person.

Variables for comparison by husband data (mn_* fields):

| Variable | Description | persons | dracs divorce act |
|---|---|---|---|
| d_first_name | levenshtein distance(first_name1, first_name2) | first_name | mn_name |
| d_last_name | levenshtein distance(last_name1, last_name2) | last_name | mn_old_surname |
| d_second_name | levenshtein distance(second_name1, second_name2) | second_name | mn_patronymic |
| d_documents_bin | min(levenshtein distance(document1, document2)) for any types of documents | person_documents.number | mn_series_numb |
| docs_same_number_bin | min(same/not) number | person_documents.number | mn_series_numb |
| birth_settlement_substr | min(position(birth_settlement_1 in birth_settlement_2) and position(birth_settlement_2 in birth_settlement_1)) | birth_settlement | mn_birth_locality (if not null or mn_birth_locality_type <> ‘Район’), else mn_birth_district (if not null), else mn_birth_region |
| authentication_methods | same/not authentication OTP number flag | person_authentication_methods.phone_number where type = OTP | - |
| residence_settlement_flag | same/not residence settlement flag | person_adresses.settlement | mn_locality (if not null or mn_locality_type <> ‘Район’), else mn_district (if not null), else mn_region |
| d_tax_id | levenshtein distance(tax_id1, tax_id2) | tax_id | mn_numident |
| gender_flag | same/not gender | gender | ‘MALE’ (const) |
| twins_flag | distance last_name <=2, same birth_date, distance in document numbers between 1 and 2 | last_name, first_name, birth_date, person_documents.number | mn_old_surname, mn_name, mn_date_birth, mn_series_numb |

Variables for comparison by wife data (wmn_* fields):

| Variable | Description | persons | dracs divorce act |
|---|---|---|---|
| d_first_name | levenshtein distance(first_name1, first_name2) | first_name | wmn_name |
| d_last_name | levenshtein distance(last_name1, last_name2) | last_name | wmn_old_surname |
| d_second_name | levenshtein distance(second_name1, second_name2) | second_name | wmn_patronymic |
| d_documents_bin | min(levenshtein distance(document1, document2)) for any types of documents | person_documents.number | wmn_series_numb |
| docs_same_number_bin | min(same/not) number | person_documents.number | wmn_series_numb |
| birth_settlement_substr | min(position(birth_settlement_1 in birth_settlement_2) and position(birth_settlement_2 in birth_settlement_1)) | birth_settlement | wmn_birth_locality (if not null or wmn_birth_locality_type <> ‘Район’), else wmn_birth_district (if not null), else wmn_birth_region |
| authentication_methods | same/not authentication OTP number flag | person_authentication_methods.phone_number where type = OTP | - |
| residence_settlement_flag | same/not residence settlement flag | person_adresses.settlement | wmn_locality (if not null or wmn_locality_type <> ‘Район’), else wmn_district (if not null), else wmn_region |
| d_tax_id | levenshtein distance(tax_id1, tax_id2) | tax_id | wmn_numident |
| gender_flag | same/not gender | gender | ‘FEMALE’ (const) |
| twins_flag | distance last_name <=2, same birth_date, distance in document numbers between 1 and 2 | last_name, first_name, birth_date, person_documents.number | wmn_old_surname, wmn_name, wmn_date_birth, wmn_series_numb |
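The field-level distances and the final score can be sketched as follows. The weights and intercept are hypothetical placeholders: the source does not give the model coefficients, which belong to the Deduplication process. Only three of the variables are shown, for the husband fields, and the sketch assumes the compared fields are non-null strings.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def build_features(person, act):
    """Subset of the comparison variables (husband fields)."""
    return {
        "d_first_name": levenshtein(person["first_name"], act["mn_name"]),
        "d_last_name": levenshtein(person["last_name"], act["mn_old_surname"]),
        "d_tax_id": levenshtein(person["tax_id"], act["mn_numident"]),
    }


def logistic_score(features, weights, intercept):
    """Map a weighted sum of feature values to a score in (0, 1)."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, for illustration only
HYPOTHETICAL_WEIGHTS = {"d_first_name": -0.8, "d_last_name": -0.9, "d_tax_id": -0.5}
HYPOTHETICAL_INTERCEPT = 3.0

person = {"first_name": "Ivan", "last_name": "Petrenko", "tax_id": "1234567890"}
act = {"mn_name": "Ivan", "mn_old_surname": "Petrenko", "mn_numident": "1234567891"}
features = build_features(person, act)
score = logistic_score(features, HYPOTHETICAL_WEIGHTS, HYPOTHETICAL_INTERCEPT)
```

With negative weights on the distance variables, larger distances push the score down, which matches the intuition that dissimilar records are less likely to describe the same person.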

Step 4. Verification candidates

Before performing this step, check that the divorce act record still has persons_compare_status = 'IN_PROCESS'.
If the status has changed, skip this step and go to the next divorce act in the obtained list (return to p.1 of Step 2).

Based on divorce act and person comparison score, there are two flows that can be performed:

  • white zone or grey zone (comparison score is greater than PERSON_DRACS_DIVORCE_ACT_PARTIAL_MATCH_SCORE value)

  • black zone (comparison score is less than PERSON_DRACS_DIVORCE_ACT_PARTIAL_MATCH_SCORE value)
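The zone decision reduces to a single threshold check. In this sketch `classify_zone` is a hypothetical helper. A score exactly equal to the threshold is treated as black zone, which is an assumption based on the parameter being described as the upper limit of the black zone; the source does not define that case explicitly.

```python
def classify_zone(score: float, partial_match_score: float = 0.7) -> str:
    """Return the flow to follow for a comparison score."""
    if score > partial_match_score:
        return "white_or_grey"
    # Assumption: equality falls into the black zone.
    return "black"
```

For example, with the default threshold a score of 0.9 leads to Step 4.1 and a score of 0.3 leads to Step 4.2.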

Step 4.1. White or grey zone

White or grey zone indicates that the divorce act possibly relates to the person, so the person may have changed their surname after the divorce. A doctor can use this as supporting information in the update person process.

  1. Create verification candidate between divorce act and person in MPI db, person_verification_candidates table, set values:

    1. id = autogenerate uuid

    2. person_id = id of a person from mpi.persons

    3. entity_id = id of divorce act from mimir.dracs_divorce_acts

    4. entity_type = ‘dracs_divorce_act’

    5. status = ‘NEW’

    6. config = variables that were used in comparison process (p.4 of Step 3)

    7. details = additional details of comparison process

    8. score = logistic regression comparison score

    9. inserted_at = now()

    10. updated_at = now()

      1. if active verification candidate between person_id and dracs_divorce_act_id already exists in person_verification_candidates table in status = ‘NEW’ - skip this pair

  2. Update persons verification status in person_verifications table based on current dracs name change verification status of person:

    1. in case dracs_name_change_verification_status = ‘VERIFICATION_NOT_NEEDED’ or ‘VERIFIED’ - update persons dracs name change verification status:

      1. set dracs_name_change_verification_status = ‘VERIFICATION_NEEDED’

      2. set dracs_name_change_verification_reason = ‘AUTO_OFFLINE’

      3. set updated_at = now()

      4. set updated_by = system_user()
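The two actions of Step 4.1 can be sketched with plain dictionaries standing in for database rows. The helper names and the `system_user_id` parameter are hypothetical. Leaving other verification statuses unchanged is an assumption, since the source describes only the two listed statuses.

```python
import uuid
from datetime import datetime, timezone

ENTITY_TYPE = "dracs_divorce_act"


def build_verification_candidate(person_id, act_id, score, config, details, existing):
    """Return a new candidate row, or None if a NEW one already exists for the pair."""
    duplicate = any(
        row["person_id"] == person_id
        and row["entity_id"] == act_id
        and row["entity_type"] == ENTITY_TYPE
        and row["status"] == "NEW"
        for row in existing
    )
    if duplicate:
        return None
    now = datetime.now(timezone.utc)
    return {
        "id": str(uuid.uuid4()),
        "person_id": person_id,
        "entity_id": act_id,
        "entity_type": ENTITY_TYPE,
        "status": "NEW",
        "config": config,
        "details": details,
        "score": score,
        "inserted_at": now,
        "updated_at": now,
    }


def next_verification_update(current_status, system_user_id):
    """Return the fields to update in person_verifications, or None for no change."""
    if current_status in ("VERIFICATION_NOT_NEEDED", "VERIFIED"):
        return {
            "dracs_name_change_verification_status": "VERIFICATION_NEEDED",
            "dracs_name_change_verification_reason": "AUTO_OFFLINE",
            "updated_at": datetime.now(timezone.utc),
            "updated_by": system_user_id,
        }
    # Assumption: any other status is left as is.
    return None
```

Checking for an existing NEW candidate before building the row keeps repeated runs of the job from creating duplicate pairs.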

Step 4.2. Black zone

Black zone indicates that the divorce act does not relate to the person, so the person most likely did not change their last name.
No further actions should be taken with the person or verification candidates.

Step 5. Processed divorce act

  1. When the divorce act has been fully compared with all candidates from the list, change its status in mimir.dracs_divorce_acts table

    1. set persons_compare_status = ‘PROCESSED’

  2. Go to next divorce act in obtained list (return to Step 2)
