ЕСОЗ - public documentation
Offline comparison of death acts with Persons_EN
Purpose
This process fetches death act data from the mimir service and compares it with person data.
Key points
This process uses cron parameter to configure its start time.
This process is performed in mpi scheduler pod.
Configuration
Value | Description | Example |
---|---|---|
PERSONS_DRACS_DEATH_ACTS_COMPARE_SCHEDULE | Cron parameter, represents start time of the offline comparison of death acts with persons process | |
PERSONS_DRACS_DEATH_ACTS_COMPARE_BATCH_SIZE | Number of death acts that will be processed in one cycle of the process | |
PERSON_DRACS_DEATH_ACT_FULL_MATCH_SCORE | Upper limit of comparison score for grey zone, lower limit of comparison score for white zone | |
PERSON_DRACS_DEATH_ACT_PARTIAL_MATCH_SCORE | Upper limit of comparison score for black zone, lower limit of comparison score for grey zone | |
Service logic
Step 1. Death acts list for comparison
Get the list of death acts that are ready to be compared with persons: perform an RPC call to the mimir service, dracs_death_acts table, with the following parameters:
act_record_operation_name = ‘1’ or ‘4’
persons_compare_status = ‘READY’
Sort the obtained list in ascending order by the dracs_death_acts.updated_at field
Limit the obtained list with the DRACS_DEATH_ACTS_SELECT_BATCH_SIZE value
In case no death acts were fetched - end process.
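The selection rules of Step 1 can be illustrated with a minimal in-memory sketch. It is not the service's implementation: the real selection happens through an RPC call to mimir, and the `DeathAct` class and `select_batch` function below are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DeathAct:
    # Hypothetical in-memory stand-in for a dracs_death_acts row.
    id: str
    act_record_operation_name: str
    persons_compare_status: str
    updated_at: datetime


def select_batch(acts, batch_size):
    """Return death acts ready for comparison, oldest update first."""
    ready = [
        act for act in acts
        if act.act_record_operation_name in ("1", "4")
        and act.persons_compare_status == "READY"
    ]
    # Ascending order by updated_at, then cut to the batch size.
    ready.sort(key=lambda act: act.updated_at)
    return ready[:batch_size]
```

An empty result corresponds to the "end process" branch: the caller would stop when `select_batch` returns an empty list.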
Step 2. Data preparation and potential candidates select
Get death act from obtained list, update its status in dracs_death_acts table:
set persons_compare_status = ‘IN_PROCESS’
Prepare death act data for comparison process
perform regexp for each of the fields:
for name, surname and patronymic change:
[ --'] to ''
'є' to 'е'
'и' to 'і'
for sex change:
1 to MALE
2 to FEMALE
everything else to null
for doc_seizes.series_numb change:
[ /%#№ _-] to ''
for date_birth change mask from dd.mm.yyyy to yyyy-mm-dd
for numident normalize value - check that it is 10 symbols, if not true - set null
Get active persons (is_active=true, status=active) as candidates for comparison with death act from MPI db using following predicate blocks:
tax_id
documents.number
birth_date + last_name
settlement_id + last_name
In case no candidates were found, update death acts persons compare status:
set persons_compare_status = ‘PROCESSED’
Go to next death act in obtained list (return to p.1 of Step 2)
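The field preparation rules of Step 2 could look roughly like the sketch below. All function names are hypothetical. It assumes the name character class means "space, hyphen, apostrophe", that only the lowercase letters listed above are replaced, and that an unparseable birth date becomes null; none of these details are confirmed by this page.

```python
import re
from datetime import datetime

# Assumed reading of the character classes given in Step 2.
NAME_STRIP = re.compile(r"[ \-']")
DOC_STRIP = re.compile(r"[ /%#№_\-]")
SEX_MAP = {"1": "MALE", "2": "FEMALE"}


def normalize_name(value):
    """Apply the name, surname and patronymic rules."""
    if value is None:
        return None
    cleaned = NAME_STRIP.sub("", value)
    return cleaned.replace("є", "е").replace("и", "і")


def normalize_sex(value):
    """Map 1 to MALE, 2 to FEMALE, everything else to None."""
    return SEX_MAP.get(str(value))


def normalize_document(value):
    """Strip separators from doc_seizes.series_numb."""
    return None if value is None else DOC_STRIP.sub("", value)


def normalize_birth_date(value):
    """Convert dd.mm.yyyy to yyyy-mm-dd; None on bad input (assumption)."""
    try:
        return datetime.strptime(value, "%d.%m.%Y").strftime("%Y-%m-%d")
    except (TypeError, ValueError):
        return None


def normalize_numident(value):
    """Keep numident only when it is exactly 10 symbols long."""
    return value if value is not None and len(value) == 10 else None
```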
Step 3. Death act comparison process
Get person from obtained candidates list
Prepare person data for comparison process according to Deduplication process NEW, section "Data cleaning and preparation"
Compare death act data with person data using logistic regression method, as implemented in Deduplication process:
For each variable field use separate calculation process based on the table below.
Calculate final comparison score between death act and person.
Variable | Description | persons | dracs death act |
---|---|---|---|
d_first_name | levenshtein distance(first_name1, first_name2) | first_name | name |
d_last_name | levenshtein distance(last_name1, last_name2) | last_name | surname |
d_second_name | levenshtein distance(second_name1, second_name2) | second_name | patronymic |
d_documents_bin | min(levenshtein distance(document1, document2)) for any types of documents | person_documents.number | doc_seizes.series_numb |
docs_same_number_bin | min(same/not) number | person_documents.number | doc_seizes.series_numb |
birth_settlement_substr | min(position(birth_settlement_1 in birth_settlement_2) and position(birth_settlement_2 in birth_settlement_1)) | birth_settlement | birth_locality (if not null or birth_locality_type <> ‘Район’), else birth_district (if not null), else birth_region |
authentication_methods | same/not authentication OTP number flag | person_authentication_methods.phone_number where type = OTP | - |
residence_settlement_flag | same/not residence settlement flag | person_adresses.settlement | locality (if not null or locality_type <> ‘Район’), else district (if not null), else region |
d_tax_id | levenshtein distance(tax_id1, tax_id2) | tax_id | numident |
gender_flag | same/not gender | gender | sex |
twins_flag | distance last_name <=2, same birth_date, distance in document numbers between 1 and 2 | last_name, first_name, birth_date, person_documents.number | surname, name, date_birth, doc_seizes.series_numb |
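To make the distance-based variables and the final score more concrete, here is a minimal sketch of a Levenshtein distance and a logistic regression score. The weights and intercept are purely hypothetical placeholders: this page does not state the model coefficients, and the actual Deduplication implementation may differ.

```python
import math


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def logistic_score(features, weights, intercept):
    """Logistic regression: sigmoid of the weighted feature sum."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical usage with two of the variables from the table.
features = {
    "d_first_name": levenshtein("іван", "іванн"),
    "d_tax_id": levenshtein("1234567890", "1234567890"),
}
weights = {"d_first_name": -1.0, "d_tax_id": -2.0}  # illustrative only
score = logistic_score(features, weights, intercept=3.0)
```

Negative weights on distance variables express that a larger distance lowers the probability of a match; the real signs and magnitudes come from the trained model.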
Step 4. Verification candidates
Before performing this step, check that death act record is still in persons_compare_status = 'IN_PROCESS'.
In case status is changed - skip this step, go to next death act in obtained list (return to p.1 of Step 2).
Based on death act and person comparison score, there are three flows that can be performed:
white zone (comparison score is greater than PERSON_DRACS_DEATH_ACT_FULL_MATCH_SCORE value)
grey zone (comparison score is between PERSON_DRACS_DEATH_ACT_PARTIAL_MATCH_SCORE and PERSON_DRACS_DEATH_ACT_FULL_MATCH_SCORE values)
black zone (comparison score is less than PERSON_DRACS_DEATH_ACT_PARTIAL_MATCH_SCORE value)
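The three flows above amount to a simple threshold check. The sketch below uses a hypothetical `classify_zone` function and treats a score exactly equal to either threshold as grey zone, which is an assumption about boundary handling that this page does not spell out.

```python
def classify_zone(score, partial_match_score, full_match_score):
    """Map a comparison score to 'white', 'grey' or 'black'."""
    if score > full_match_score:
        return "white"
    if score < partial_match_score:
        return "black"
    # Everything between the two thresholds, boundaries included (assumption).
    return "grey"
```

Here `partial_match_score` and `full_match_score` stand for the PERSON_DRACS_DEATH_ACT_PARTIAL_MATCH_SCORE and PERSON_DRACS_DEATH_ACT_FULL_MATCH_SCORE configuration values.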
Step 4.1. White zone
White zone indicates that the death act relates to the person, therefore it can be stated that the person is highly likely deceased and must be deactivated.
Create verification candidate between death act and person in MPI db, person_verification_candidates table, set values:
id = autogenerate uuid
person_id = id of a person from mpi.persons
entity_id = id of death act from mimir.dracs_death_acts
entity_type = ‘dracs_death_act’
status = ‘CONFIRMED’
config = variables that were used in comparison process (p.3 of Step 3)
details = additional details of comparison process
score = logistic regression comparison score
inserted_at = now()
updated_at = now()
Deactivate person
set persons.status = inactive, persons.updated_at = now()
terminate active declarations for person
deactivate active persons authentication methods
revert active verification candidates for person in person_verification_candidates table by person_id:
set status = ‘NOT_CONFIRMED’
set updated_at = now()
Update persons verification status in person_verifications table:
set dracs_death_act_id = id of death act from mimir.dracs_death_acts
set dracs_death_verification_status = ‘VERIFIED’
set dracs_death_verification_reason = ‘AUTO_OFFLINE’
set dracs_death_online_status = ‘COMPLETED’
set updated_at = now()
set updated_by = system_user()
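As an illustration of the verification candidate record, the sketch below assembles the listed fields into a plain dictionary. The function name and the dictionary representation are hypothetical; the actual service writes a row to the person_verification_candidates table, and the status is passed in by the caller (‘CONFIRMED’ for the white zone).

```python
import uuid
from datetime import datetime, timezone


def build_verification_candidate(person_id, death_act_id, score,
                                 config, details, status):
    """Assemble the fields of a person_verification_candidates row."""
    now = datetime.now(timezone.utc)
    return {
        "id": str(uuid.uuid4()),          # autogenerated uuid
        "person_id": person_id,           # id from mpi.persons
        "entity_id": death_act_id,        # id from mimir.dracs_death_acts
        "entity_type": "dracs_death_act",
        "status": status,
        "config": config,                 # variables used in the comparison
        "details": details,               # additional comparison details
        "score": score,                   # logistic regression score
        "inserted_at": now,
        "updated_at": now,
    }
```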
Step 4.2. Grey zone
Grey zone indicates that death act possibly relates to person, therefore it can be stated that person may be deceased and must be inspected by doctor or NHS data steward (or both).
Create verification candidate between death act and person in MPI db, person_verification_candidates table, set values:
id = autogenerate uuid
person_id = id of a person from mpi.persons
entity_id = id of death act from mimir.dracs_death_acts
entity_type = ‘dracs_death_act’
status = ‘NEW’
config = variables that were used in comparison process (p.3 of Step 3)
details = additional details of comparison process
score = logistic regression comparison score
inserted_at = now()
updated_at = now()
Update persons verification status in person_verifications table based on current dracs death verification status of person:
in case dracs_death_verification_status = ‘NOT_VERIFIED’ - do not update persons dracs death verification status
in case dracs_death_verification_status = ‘VERIFICATION_NEEDED’ and dracs_death_verification_reason = ‘MANUAL_NOT_CONFIRMED’ - do not update persons dracs death verification status
in case dracs_death_verification_status = ‘VERIFICATION_NEEDED’ and dracs_death_verification_reason = ‘MANUAL_CONFIRMED’ - do not update persons dracs death verification status
in case dracs_death_verification_status = ‘VERIFICATION_NEEDED’ and dracs_death_verification_reason = ‘ONLINE_TRIGGERED’ - update persons dracs death verification status:
set dracs_death_verification_status = ‘NOT_VERIFIED’
set dracs_death_verification_reason = ‘AUTO_OFFLINE’
set updated_at = now()
set updated_by = system_user()
in case dracs_death_verification_status = ‘VERIFICATION_NEEDED’ and dracs_death_verification_reason = ‘INITIAL’ - update persons dracs death verification status:
set dracs_death_verification_status = ‘NOT_VERIFIED’
set dracs_death_verification_reason = ‘AUTO_OFFLINE’
set updated_at = now()
set updated_by = system_user()
in case dracs_death_verification_status = ‘IN_REVIEW’ and dracs_death_verification_reason = ‘MANUAL’ - do not update persons dracs death verification status
in case dracs_death_verification_status = ‘VERIFIED’ - update persons dracs death verification status:
set dracs_death_verification_status = ‘NOT_VERIFIED’
set dracs_death_verification_reason = ‘AUTO_OFFLINE’
set updated_at = now()
set updated_by = system_user()
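The grey zone rules above reduce to a small decision function. The sketch below uses a hypothetical `grey_zone_update` helper that returns the new status and reason, or `None` when the record must stay unchanged. Combinations not listed on this page are treated as "do not update", which is an assumption.

```python
def grey_zone_update(status, reason):
    """Return (new_status, new_reason) or None if no update is needed."""
    target = ("NOT_VERIFIED", "AUTO_OFFLINE")
    if status == "VERIFIED":
        return target
    if status == "VERIFICATION_NEEDED" and reason in ("ONLINE_TRIGGERED", "INITIAL"):
        return target
    # NOT_VERIFIED, manual reasons, IN_REVIEW and unlisted cases: no update.
    return None
```

When the function returns a value, the caller would also set updated_at = now() and updated_by = system_user(), as listed above.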
Step 4.3. Black zone
Black zone indicates that the death act does not relate to the person, therefore it can be stated that the person is highly likely not deceased.
No further actions should be taken with person or verification candidates.
Step 5. Processed death act
When death act is fully compared with all candidates from list, change its status in mimir.dracs_death_acts table:
set persons_compare_status = ‘PROCESSED’
Go to next death act in obtained list (return to Step 2)