ЕСОЗ - public documentation

RC_Offline comparison of marriage acts with Persons (DRACS 2.0)

Purpose

This process fetches marriage act data from the mimir service and compares it with persons data.

Key points

  1. This process uses cron parameter to configure its start time.

  2. This process is performed in mpi scheduler pod.

  3. Each marriage act is compared by husband data and by wife data separately.

Configuration

| Parameter | Description | Example |
|---|---|---|
| PERSONS_DRACS_MARRIAGE_ACTS_COMPARE_SCHEDULE | Cron parameter, represents start time of offline comparison of marriage acts with persons process | */10 * * * * |
| PERSONS_DRACS_MARRIAGE_ACTS_COMPARE_BATCH_SIZE | Number of marriage acts that will be processed in one cycle of the process | 100 |
| PERSON_DRACS_MARRIAGE_ACT_PARTIAL_MATCH_SCORE | Upper limit of comparison score for black zone, lower limit of comparison score for grey zone | 0.7 |
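
As an illustration of how these three parameters might be consumed, here is a minimal sketch. It assumes they are exposed as environment variables and uses the example values from the table as fallbacks; the `CompareConfig` class and `load_config` function are hypothetical names, and the fallbacks are not confirmed defaults of the service.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CompareConfig:
    schedule: str               # cron expression for the job start time
    batch_size: int             # marriage acts processed in one cycle
    partial_match_score: float  # boundary between black and grey zones


def load_config(env=None):
    """Read the parameters; fall back to the example values (assumption)."""
    env = os.environ if env is None else env
    return CompareConfig(
        schedule=env.get(
            "PERSONS_DRACS_MARRIAGE_ACTS_COMPARE_SCHEDULE", "*/10 * * * *"),
        batch_size=int(env.get(
            "PERSONS_DRACS_MARRIAGE_ACTS_COMPARE_BATCH_SIZE", "100")),
        partial_match_score=float(env.get(
            "PERSON_DRACS_MARRIAGE_ACT_PARTIAL_MATCH_SCORE", "0.7")),
    )
```

With the example schedule `*/10 * * * *`, the job would start every ten minutes.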

Service logic

Offline comparison job starts according to PERSONS_DRACS_MARRIAGE_ACTS_COMPARE_SCHEDULE cron parameter.

Step 1. Marriage acts list for comparison

  1. Get list of marriage acts that are ready to be compared to persons - perform RPC call to mimir service, dracs_marriage_acts table, with following parameters:

    1. ar_op_name = ‘1’ or ‘4’

    2. persons_compare_status = ‘READY’

  2. Sort obtained list in ascending order by dracs_marriage_acts.updated_at field

  3. Limit obtained list with PERSONS_DRACS_MARRIAGE_ACTS_COMPARE_BATCH_SIZE value

  4. In case no marriage acts were fetched - end process.
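
The selection in Step 1 can be sketched as follows. This is an in-memory illustration only: the real process performs an RPC call to the mimir service, whose interface is not described here, so the `select_batch` function and the dictionary shape of an act are assumptions.

```python
def select_batch(acts, batch_size):
    """Filter, sort and limit marriage acts as described in Step 1."""
    ready = [
        act for act in acts
        if act["ar_op_name"] in ("1", "4")
        and act["persons_compare_status"] == "READY"
    ]
    # Oldest updates first, then cut to the configured batch size.
    ready.sort(key=lambda act: act["updated_at"])
    return ready[:batch_size]
```

An empty result corresponds to p.4: the process ends without further steps.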

Step 2. Data preparation and potential candidates select

  1. Get marriage act from obtained list, update its status in dracs_marriage_acts table:

    1. set persons_compare_status = ‘IN_PROCESS’

  2. Get husband data for comparison process if husband_old_surname <> husband_surname and these fields are not empty

  3. Get wife data for comparison process if wife_old_surname <> wife_surname and these fields are not empty

  4. Prepare marriage act data for comparison process

    1. perform regexp for each of the fields:

      1. for husband_name, wife_name, husband_old_surname, wife_old_surname, husband_patronymic and wife_patronymic change:

        1. [ --'] to ''

        2. 'є' to 'е'

        3. 'и' to 'і'

      2. for husband_series_numb and wife_series_numb change:

        1. [ /%#№ _-]  to ''

    2. for husband_date_birth and wife_date_birth change mask from dd.mm.yyyy to yyyy-mm-dd

    3. for husband_numident and wife_numident normalize value - check that it is exactly 10 characters long, otherwise set null

  5. Get active persons (is_active=true, status=active) as candidates for comparison with husband or wife data from marriage act from MPI db using following predicate blocks:

    1. tax_id

    2. documents.number

    3. birth_date + last_name

    4. settlement_id + last_name

  6. In case no candidates were found for husband and/or wife, update marriage acts persons compare status:

    1. set persons_compare_status = ‘PROCESSED’

  7. Go to next marriage act in obtained list (return to p.1 of Step 2)
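
The data preparation rules from p.4 of Step 2 can be sketched as below. The function names are hypothetical. The name pattern is an interpretation: the class `[ --']` is read here as "space, hyphen, apostrophe", which is an assumption about the intended regexp. Only the lowercase letters listed above are replaced, since the source does not mention case handling.

```python
import re
from datetime import datetime

# Assumed reading of the documented classes: strip these characters.
_NAME_STRIP = re.compile(r"[ \-']")
_DOC_STRIP = re.compile(r"[ /%#№_\-]")


def normalize_name(value):
    """Strip separators, then map 'є' to 'е' and 'и' to 'і'."""
    cleaned = _NAME_STRIP.sub("", value)
    return cleaned.replace("є", "е").replace("и", "і")


def normalize_doc_number(value):
    """Remove separators from a document series and number."""
    return _DOC_STRIP.sub("", value)


def normalize_birth_date(value):
    """Convert dd.mm.yyyy to yyyy-mm-dd; raises ValueError on bad input."""
    return datetime.strptime(value, "%d.%m.%Y").date().isoformat()


def normalize_numident(value):
    """Keep the value only if it is exactly 10 characters long."""
    return value if value is not None and len(value) == 10 else None
```

How invalid dates should be handled is not specified in the source; this sketch simply lets the parsing error propagate.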

Step 3. Marriage act comparison process

  1. Get person from obtained candidates list

  2. Prepare persons data for comparison process according to the "Data cleaning and preparation" section of Deduplication process NEW

  3. Define person’s document for Deduplication process:

    1. If passport document type is present in the act (DocType=1), then choose person’s PASSPORT or NATIONAL_ID depending on the passport number in the act (determined by regex).

    2. If other document type is present in the act, then submit a list of documents to deduplication process: BIRTH_CERTIFICATE, COMPLEMENTARY_PROTECTION_CERTIFICATE, PERMANENT_RESIDENCE_PERMIT, REFUGEE_CERTIFICATE, TEMPORARY_CERTIFICATE, TEMPORARY_PASSPORT

  4. Compare marriage act data with person data using logistic regression method, as implemented in Deduplication process:

    1. For each variable field use separate calculation process based on the table below.

    2. Calculate final comparison score between marriage act and person.

Comparison variables for husband data:

| Variable | Description | persons | dracs marriage act |
|---|---|---|---|
| d_first_name | levenshtein distance(first_name1, first_name2) | first_name | husband_name |
| d_last_name | levenshtein distance(last_name1, last_name2) | last_name | husband_old_surname |
| d_second_name | levenshtein distance(second_name1, second_name2) | second_name | husband_patronymic |
| d_documents_bin | min(levenshtein distance(document1, document2)) for any types of documents | person_documents.number | husband_series_numb |
| docs_same_number_bin | min(same/not) number | person_documents.number | husband_series_numb |
| birth_settlement_substr | min(position(birth_settlement_1 in birth_settlement_2) and position(birth_settlement_2 in birth_settlement_1)) | birth_settlement | husband_birth_locality (if not null or husband_birth_locality_type <> ‘Район’), else husband_birth_district (if not null), else husband_birth_region |
| authentication_methods | same/not authentication OTP number flag | person_authentication_methods.phone_number where type = OTP | - |
| residence_settlement_flag | same/not residence settlement flag | person_adresses.settlement | husband_locality (if not null or husband_locality_type <> ‘Район’), else husband_district (if not null), else husband_region |
| d_tax_id | levenshtein distance(tax_id1, tax_id2) | tax_id | husband_numident |
| gender_flag | same/not gender | gender | ‘MALE’ (const) |
| twins_flag | distance last_name <= 2, same birth_date, distance in document numbers between 1 and 2 | last_name, first_name, birth_date, person_documents.number | husband_old_surname, husband_name, husband_date_birth, husband_series_numb |

Comparison variables for wife data:

| Variable | Description | persons | dracs marriage act |
|---|---|---|---|
| d_first_name | levenshtein distance(first_name1, first_name2) | first_name | wife_name |
| d_last_name | levenshtein distance(last_name1, last_name2) | last_name | wife_old_surname |
| d_second_name | levenshtein distance(second_name1, second_name2) | second_name | wife_patronymic |
| d_documents_bin | min(levenshtein distance(document1, document2)) for any types of documents | person_documents.number | wife_series_numb |
| docs_same_number_bin | min(same/not) number | person_documents.number | wife_series_numb |
| birth_settlement_substr | min(position(birth_settlement_1 in birth_settlement_2) and position(birth_settlement_2 in birth_settlement_1)) | birth_settlement | wife_birth_locality (if not null or wife_birth_locality_type <> ‘Район’), else wife_birth_district (if not null), else wife_birth_region |
| authentication_methods | same/not authentication OTP number flag | person_authentication_methods.phone_number where type = OTP | - |
| residence_settlement_flag | same/not residence settlement flag | person_adresses.settlement | wife_locality (if not null or wife_locality_type <> ‘Район’), else wife_district (if not null), else wife_region |
| d_tax_id | levenshtein distance(tax_id1, tax_id2) | tax_id | wife_numident |
| gender_flag | same/not gender | gender | ‘FEMALE’ (const) |
| twins_flag | distance last_name <= 2, same birth_date, distance in document numbers between 1 and 2 | last_name, first_name, birth_date, person_documents.number | wife_old_surname, wife_name, wife_date_birth, wife_series_numb |
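
To make the scoring in Step 3 more concrete, here is a minimal sketch for a subset of the husband variables. The weights and intercept are invented for illustration only; the real coefficients belong to the Deduplication process and are not given in this document. The sketch also assumes all compared fields are non-null strings.

```python
import math


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


# Hypothetical coefficients, NOT the values used by the real model.
WEIGHTS = {
    "d_first_name": -1.2,
    "d_last_name": -1.5,
    "d_second_name": -0.8,
    "d_tax_id": -0.9,
    "gender_flag": 1.0,
}
INTERCEPT = 3.0


def husband_features(person, act):
    """Compute a subset of the variables from the husband table."""
    return {
        "d_first_name": levenshtein(person["first_name"], act["husband_name"]),
        "d_last_name": levenshtein(person["last_name"], act["husband_old_surname"]),
        "d_second_name": levenshtein(person["second_name"], act["husband_patronymic"]),
        "d_tax_id": levenshtein(person["tax_id"], act["husband_numident"]),
        "gender_flag": 1 if person["gender"] == "MALE" else 0,
    }


def comparison_score(features):
    """Logistic function applied to the weighted sum of the variables."""
    z = INTERCEPT + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))
```

The wife comparison would follow the same shape with the wife_* fields and ‘FEMALE’ as the gender constant.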

Step 4. Verification candidates

Before performing this step, check that marriage act record is still in persons_compare_status = 'IN_PROCESS'.
In case status is changed - skip this step, go to next marriage act in obtained list (return to p.1 of Step 2).

Based on marriage act and person comparison score, there are two flows that can be performed:

  • white zone or grey zone (comparison score is greater than PERSON_DRACS_MARRIAGE_ACT_PARTIAL_MATCH_SCORE value)

  • black zone (comparison score is less than PERSON_DRACS_MARRIAGE_ACT_PARTIAL_MATCH_SCORE value)
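
The zone decision can be sketched as a single threshold check. The text above does not say which flow applies when the score is exactly equal to the threshold; treating that case as black zone here is an assumption, as is the function name.

```python
def comparison_zone(score, partial_match_score):
    """Return the flow to perform for a marriage act and person pair."""
    # Assumption: a score equal to the threshold falls into the black zone.
    if score > partial_match_score:
        return "white_or_grey"
    return "black"
```

For example, with the threshold 0.7 a score of 0.85 leads to Step 4.1 and a score of 0.4 leads to Step 4.2.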

Step 4.1. White or grey zone

White zone or grey zone indicates that the marriage act possibly relates to the person, so it can be assumed that the person changed their surname after marriage, and the match can be used by a doctor as supporting information for the update person process.

  1. Create verification candidate between marriage act and person in MPI db, person_verification_candidates table, set values:

    1. id = autogenerate uuid

    2. person_id = id of a person from mpi.persons

    3. entity_id = id of marriage act from mimir.dracs_marriage_acts

    4. entity_type = ‘dracs_marriage_act’

    5. status = ‘NEW’

    6. config = variables that were used in comparison process (p.3 of Step 3)

    7. details = additional details of comparison process

    8. score = logistic regression comparison score

    9. inserted_at = now()

    10. updated_at = now()

    Note: if an active verification candidate between person_id and dracs_marriage_act_id already exists in person_verification_candidates table in status = ‘NEW’ - skip this pair (do not create a duplicate)

  2. Update persons verification status in person_verifications table based on current dracs name change verification status of person:

    1. in case dracs_name_change_verification_status = ‘VERIFICATION_NOT_NEEDED’ or ‘VERIFIED’ - update persons dracs name change verification status:

      1. set dracs_name_change_verification_status = ‘VERIFICATION_NEEDED’

      2. set dracs_name_change_verification_reason = ‘AUTO_OFFLINE’

      3. set updated_at = now()

      4. set updated_by = system_user()
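
The two actions of Step 4.1 can be sketched as plain data transformations. The function names and the `existing_new_pairs` argument (a set of (person_id, entity_id) pairs that already have a candidate in status NEW) are hypothetical; actual writes to the MPI db, as well as the updated_at and updated_by fields of the verification record, are left out.

```python
import uuid
from datetime import datetime, timezone


def build_verification_candidate(person_id, act_id, score, config, details,
                                 existing_new_pairs):
    """Return a new candidate record, or None if the pair already exists."""
    if (person_id, act_id) in existing_new_pairs:
        return None  # skip this pair
    now = datetime.now(timezone.utc)
    return {
        "id": str(uuid.uuid4()),
        "person_id": person_id,
        "entity_id": act_id,
        "entity_type": "dracs_marriage_act",
        "status": "NEW",
        "config": config,
        "details": details,
        "score": score,
        "inserted_at": now,
        "updated_at": now,
    }


def verification_status_update(current_status):
    """Return the fields to change, or None when no update is needed."""
    if current_status in ("VERIFICATION_NOT_NEEDED", "VERIFIED"):
        return {
            "dracs_name_change_verification_status": "VERIFICATION_NEEDED",
            "dracs_name_change_verification_reason": "AUTO_OFFLINE",
        }
    return None
```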

Step 4.2. Black zone

Black zone indicates that the marriage act does not relate to the person, so it can be assumed that the person most likely did not change their last name.
No further actions should be taken with person or verification candidates.

Step 5. Processed marriage act

  1. When marriage act is fully compared with all candidates from list, change its status in mimir.dracs_marriage_acts table

    1. set persons_compare_status = ‘PROCESSED’

  2. Go to next marriage act in obtained list (return to Step 2)
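
Putting Steps 2 to 5 together, the status handling for a single marriage act might look like the sketch below. The callbacks `find_candidates`, `compare` and `on_match` are hypothetical stand-ins for candidate selection, scoring and Step 4.1; the act is a plain dictionary rather than a database row.

```python
def process_act(act, find_candidates, compare, threshold, on_match):
    """Walk one marriage act through IN_PROCESS to PROCESSED."""
    act["persons_compare_status"] = "IN_PROCESS"
    scored = [(person, compare(act, person)) for person in find_candidates(act)]

    # Step 4 precondition: skip if the status was changed in the meantime.
    if act["persons_compare_status"] != "IN_PROCESS":
        return

    for person, score in scored:
        if score > threshold:  # white or grey zone
            on_match(act, person, score)
        # Black zone: no further actions.

    # Also covers the case with no candidates (p.6 of Step 2).
    act["persons_compare_status"] = "PROCESSED"
```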
