Data masking quickstart¶
This quickstart gets the data masking component installed and lets you try it with a test database. You needaccess to a Percona Server for MySQL 8.4 server. Follow the steps to install, set permissions, create sample data, and run masking examples. For production use and limitations, see the Data masking overview.
Prerequisites¶
- Percona Server for MySQL (latest recommended).
- The
component_masking_functionslibrary in the server plugin directory. RunSHOW VARIABLES LIKE 'plugin_dir';and confirm the library exists in that directory. If it is missing, see Install the data masking component.
Install the component and set permissions¶
Do these once, in order. Full details: Install the data masking component.
-
Create the dictionary table (required before step 2):
CREATE TABLE IF NOT EXISTS mysql.masking_dictionaries( Dictionary VARCHAR(256) NOT NULL, Term VARCHAR(256) NOT NULL, UNIQUE INDEX dictionary_term_idx (Dictionary, Term) ) ENGINE = InnoDB DEFAULT CHARSET=utf8mb4; -
Install the component:
INSTALL COMPONENT 'file://component_masking_functions'; -
Grant
mysql.sessionaccess to the dictionary table (required forgen_dictionaryandgen_blockliston 8.4.4-1 and later):GRANT SELECT, INSERT, UPDATE, DELETE ON mysql.masking_dictionaries TO 'mysql.session'@'localhost'; -
Grant dictionary management to your user (required for the dictionary-based masking section below; skip if you will not try that section). Replace
<user>and<host>with the MySQL user you connect as (for example,rootandlocalhost):GRANT MASKING_DICTIONARIES_ADMIN ON *.* TO '<user>'@'<host>';No
FLUSH PRIVILEGESis needed; the grant takes effect immediately. If the privilege does not appear, reconnect to the server.
Create the test database and tables¶
Run the following script once to create the database and both tables with sample rows. The column types and values are chosen so you can try masking functions in the next sections.
CREATE DATABASE IF NOT EXISTS masking_demo;
USE masking_demo;
CREATE TABLE contacts (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100),
ssn VARCHAR(11),
card_no VARCHAR(19),
email VARCHAR(255),
phone VARCHAR(20),
notes VARCHAR(255)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO contacts (name, ssn, card_no, email, phone, notes)
VALUES
('Priya Sharma', '123-45-6789', '4111111111111111', 'priya.sharma@example.com', '1-555-010-1234', 'VIP'),
('Yuki Tanaka', '987-65-4321', '5500000000000004', 'yuki.tanaka@example.com', '1-555-010-5678', 'Newsletter'),
('Fatima Hassan', '111-22-3333', '340000000000009', 'fatima.hassan@example.com', '1-555-010-9012', NULL);
CREATE TABLE contacts_intl (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100),
country VARCHAR(50),
canada_sin VARCHAR(11),
uk_nin VARCHAR(20),
iban VARCHAR(40)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO contacts_intl (name, country, canada_sin, uk_nin, iban)
VALUES
('Jean Lefebvre', 'Canada', '506-948-819', NULL, NULL),
('Siobhan O''Brien', 'UK', NULL, 'CT 26 46 83 D', NULL),
('Hans Mueller', 'Germany', NULL, NULL, 'DE27 1002 02003 77495 4156');
Expected output
Query OK, 1 row affected (0.00 sec)
Database changed
Query OK, 0 rows affected (0.01 sec)
Query OK, 3 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 3 rows affected (0.00 sec)
View raw data¶
SELECT id, name, ssn, card_no, email, phone FROM contacts;
Expected output
+----+--------------+-------------+------------------+-------------------------+----------------+
| id | name | ssn | card_no | email | phone |
+----+--------------+-------------+------------------+-------------------------+----------------+
| 1 | Priya Sharma | 123-45-6789 | 4111111111111111 | priya.sharma@example.com| 1-555-010-1234 |
| 2 | Yuki Tanaka | 987-65-4321 | 5500000000000004 | yuki.tanaka@example.com | 1-555-010-5678 |
| 3 | Fatima Hassan| 111-22-3333 | 340000000000009 | fatima.hassan@example.com| 1-555-010-9012 |
+----+--------------+-------------+------------------+-------------------------+----------------+
3 rows in set (0.00 sec)
Mask sensitive columns in queries¶
The examples below use masking functions in SELECT; the table data is unchanged. To enforce masking for other users, use a view (see Expose masked data via a view) and restrict access to the base table. For limitations and security, see the Data masking overview.
Payment card and SSN¶
mask_pan leaves the last four digits visible; mask_ssn does the same for SSN.
SELECT
name,
mask_pan(card_no) AS card_masked,
mask_ssn(ssn) AS ssn_masked
FROM contacts;
Expected output
+--------------+------------------+-------------+
| name | card_masked | ssn_masked |
+--------------+------------------+-------------+
| Priya Sharma | XXXXXXXXXXXX1111 | XXX-XX-6789 |
| Yuki Tanaka | XXXXXXXXXXXX0004 | XXX-XX-4321 |
| Fatima Hassan| XXXXXXXXXX0009 | XXX-XX-3333 |
+--------------+------------------+-------------+
3 rows in set (0.00 sec)
Inner and outer parts of a string¶
mask_inner keeps the first and last character(s); mask_outer masks the left and right ends. The third and fourth arguments are the number of characters to keep unmasked on the left and right.
SELECT
name,
mask_inner(name, 1, 1) AS name_inner_masked,
mask_outer(phone, 2, 4) AS phone_outer_masked
FROM contacts;
Expected output
+--------------+------------------+------------------+
| name | name_inner_masked| phone_outer_masked|
+--------------+------------------+------------------+
| Priya Sharma | PXXXXXXXXXXa | XX5-010-1234 |
| Yuki Tanaka | YXXXXXXXXa | XX5-010-5678 |
| Fatima Hassan| FXXXXXXXXXXn | XX5-010-9012 |
+--------------+------------------+------------------+
3 rows in set (0.00 sec)
Generated values (for testing)¶
gen_rnd_email() and gen_rnd_ssn() return new values each time. They are useful for generating test data, not for masking existing values in place.
SELECT
id,
gen_rnd_email(5, 8, 'test.example') AS generated_email,
gen_rnd_ssn() AS generated_ssn
FROM contacts;
Expected output
+----+---------------------------+-------------+
| id | generated_email | generated_ssn|
+----+---------------------------+-------------+
| 1 | abcde.fghijkl@test.example| 912-34-5678 |
| 2 | vwxyz.abcdefg@test.example| 987-65-4321 |
| 3 | qwert.yuiopas@test.example| 923-45-6789 |
+----+---------------------------+-------------+
3 rows in set (0.00 sec)
Values will differ on each run. For constraints and caveats, see Data masking component functions and variables.
International data masking functions¶
The component includes functions for region-specific identifiers: Canadian Social Insurance Number (SIN) (mask_canada_sin), United Kingdom National Insurance Number (NIN) (mask_uk_nin), and International Bank Account Number (IBAN) (mask_iban). The script above already created the contacts_intl table. View the raw data, then apply the region-specific masking functions:
SELECT name, country, canada_sin, uk_nin, iban FROM contacts_intl;
Expected output
+----------------+---------+-------------+----------------+------------------------------+
| name | country | canada_sin | uk_nin | iban |
+----------------+---------+-------------+----------------+------------------------------+
| Jean Lefebvre | Canada | 506-948-819 | NULL | NULL |
| Siobhan O'Brien| UK | NULL | CT 26 46 83 D | NULL |
| Hans Mueller | Germany | NULL | NULL | DE27 1002 02003 77495 4156 |
+----------------+---------+-------------+----------------+------------------------------+
3 rows in set (0.00 sec)
SELECT
name,
country,
mask_canada_sin(canada_sin) AS canada_sin_masked,
mask_uk_nin(uk_nin) AS uk_nin_masked,
mask_iban(iban) AS iban_masked
FROM contacts_intl;
Expected output
+----------------+---------+------------------+------------------+----------------------------+
| name | country | canada_sin_masked| uk_nin_masked | iban_masked |
+----------------+---------+------------------+------------------+----------------------------+
| Jean Lefebvre | Canada | XXX-XXX-XXX | NULL | NULL |
| Siobhan O'Brien| UK | NULL | CT ** ** ** * | NULL |
| Hans Mueller | Germany | NULL | NULL | DE** **** **** **** **** |
+----------------+---------+------------------+------------------+----------------------------+
3 rows in set (0.00 sec)
Each function returns NULL when the input is NULL. For more options (for example, custom mask characters) and the corresponding generators (gen_rnd_canada_sin, gen_rnd_uk_nin, gen_rnd_iban), see Data masking component functions and variables.
Expose masked data via a view¶
To ensure that only masked data is visible, define a view that applies the masking functions and grant SELECT on the view (not on the base table) to users who should see masked data. Example:
CREATE VIEW contacts_masked AS
SELECT
id,
name,
mask_pan(card_no) AS card_no,
mask_ssn(ssn) AS ssn,
mask_outer(email, 2, 4) AS email,
mask_outer(phone, 2, 4) AS phone,
notes
FROM contacts;
SELECT id, name, card_no, ssn, email, phone FROM contacts_masked;
Expected output
+----+--------------+------------------+-------------+--------------------------+------------------+
| id | name | card_no | ssn | email | phone |
+----+--------------+------------------+-------------+--------------------------+------------------+
| 1 | Priya Sharma | XXXXXXXXXXXX1111 | XXX-XX-6789 | XXriya.sharma@exampXXXX | XX5-010-1234 |
| 2 | Yuki Tanaka | XXXXXXXXXXXX0004 | XXX-XX-4321 | XXuki.tanaka@exampXXXX | XX5-010-5678 |
| 3 | Fatima Hassan| XXXXXXXXXX0009 | XXX-XX-3333 | XXtima.hassan@examXXXX | XX5-010-9012 |
+----+--------------+------------------+-------------+--------------------------+------------------+
3 rows in set (0.00 sec)
The view uses mask_outer(..., 2, 4) for email and phone. Grant SELECT on the view (not the base table) to users who should see masked data only. For access control and limitations, see the Data masking overview.
Optional: dictionary-based masking¶
Functions such as gen_dictionary and gen_blocklist use the mysql.masking_dictionaries table.
You need the step 4 grant (MASKING_DICTIONARIES_ADMIN) and, on 8.4.4-1 and later, the step 3 grant to mysql.session described in the install guide.
Add a small dictionary with masking_dictionary_term_add, then try gen_dictionary:
SELECT masking_dictionary_term_add('roles', 'Engineer');
SELECT masking_dictionary_term_add('roles', 'Analyst');
SELECT masking_dictionary_term_add('roles', 'Manager');
SELECT id, name, gen_dictionary('roles') AS role FROM contacts;
Expected output (role column varies)
+----+--------------+----------+
| id | name | role |
+----+--------------+----------+
| 1 | Priya Sharma | Analyst |
| 2 | Yuki Tanaka | Manager |
| 3 | Fatima Hassan| Engineer |
+----+--------------+----------+
3 rows in set (0.00 sec)
To remove the dictionary when finished: SELECT masking_dictionary_remove('roles'); (masking_dictionary_remove). For large dictionaries or production use, see the Data masking overview and function list.
Clean up (optional)¶
To remove the test database and all its objects (tables, the contacts_masked view, and data):
DROP DATABASE IF EXISTS masking_demo;