This is the artifact of "Real or Rogue? Detecting Malicious Miniapps with Deceptive Reporting Interface" (WWW'26).
WWW'26 Artifact Evaluation Information: instructions for checking this artifact are in Section 3.
This tool, "MIRAGE", identifies miniapps with deceptive reporting interfaces. In general, the tool consists of three components.
Component 0: Analyze the entire miniapp package and generate the data flow graph.
Component 1A: Reconstruct the displayable contents of the pages and resolve the texts on each page.
Component 1B: Reconstruct the data flow to resolve the domains associated with information-sending APIs.
Component 2: A similarity model to fit, embed, and infer the similarity between any given page and the official reporting interface.
Availability: The artifact makes the code, the dataset, and the accompanying analysis results of the malware behavior available. The code and analysis notes are included in this repository. The dataset is only partially included in this repository, to keep preliminary testing of the artifact lightweight. The full dataset can be requested by following the instructions at https://minimalware.github.io/.
As requested by the affected platforms that own the miniapp packages, we cannot release the dataset without verifying requesters' identities and ensuring the dataset is processed ethically.
Functionality: The released code is fully functional, i.e., it executes and produces output in the expected formats.
- The results of the module that analyzes and reconstructs the displayable contents and the domain list are written to scannedpages.csv and scannedurls.csv.
- The similarity scores w.r.t. the official interface used for malware identification, as well as the final identification results, are generated in the data/ folder.
Reproducibility: Due to the nature of large-scale analysis and the embedding model required for malware identification, the reproducibility of this tool is evaluated separately. For the data reconstruction output (1-run.py), the result is deterministic and stays the same across different combinations of miniapps to analyze. The results are reproducible, and a snippet containing the entries of the 5 miniapps from the original JSON file is attached in ground_truth for cross-checking. For the similarity-based identification (2-analyze.py), the result may vary across executions and will vary with different corpora of miniapps to analyze. Nevertheless, this repository retains the scripts for training the transformer model, setting the threshold, and identifying malware with it, so that future researchers can build on them. However, please bear in mind that since this repository provides only 5 miniapps to evaluate, the identification results will differ from those in the paper. For any further inquiries, please contact the first author at https://frostwing98.com/.
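For a quick cross-check of the deterministic phase-1 output against the ground_truth snippet, a sketch such as the following can be used after running 1-run.py. The filename ground_truth/snippet.json and the appid field are hypothetical placeholders; substitute whatever file and keys are actually shipped in ground_truth/.

```python
# Minimal cross-check sketch: confirm each appid in the ground-truth snippet
# also appears in the freshly generated scannedpages.csv.
# NOTE: "ground_truth/snippet.json" and the "appid" key are hypothetical;
# replace them with the actual file and field names in ground_truth/.
import json

csv_text = open("scannedpages.csv", encoding="utf-8").read()
with open("ground_truth/snippet.json", encoding="utf-8") as f:
    truth = json.load(f)          # assumed: a JSON list of per-miniapp records

for entry in truth:
    appid = entry.get("appid", "")
    status = "found" if appid and appid in csv_text else "NOT FOUND"
    print(f"{appid}: {status} in scannedpages.csv")
```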
To set up and use the tool, please follow the steps below.
chmod +x delete.sh
chmod +x decrypt.sh
chmod +x restart.sh
./restart.sh
python3 -m venv ./venv
source ./venv/bin/activate
Please bear in mind that this repo contains NodeJS code. To reduce potential issues, node_modules is included. Users are welcome to set up and install the required libraries on their own if needed.
For Python, please install the following libraries, depending on which function you want to perform (a small dependency sanity-check sketch is provided after the lists below).
python3 -m pip install {necessary_lib}
These libraries are required for executing the analysis tool, 1-run.py.
escodegen
graphviz
lxml
These libraries are required for identifying malware based on similarity scores with 2-analyze.py.
matplotlib
transformers
sentence_transformers
pandas
seaborn
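Before running the scripts, a quick sanity check such as the sketch below can confirm that the dependencies resolve inside the active virtual environment. It assumes each import name matches the pip package name, which may not hold for every dependency.

```python
# Minimal sanity-check sketch: verify the Python dependencies listed above are
# importable. Import names are assumed to equal the pip package names.
import importlib.util

required = {
    "1-run.py": ["escodegen", "graphviz", "lxml"],
    "2-analyze.py": ["matplotlib", "transformers", "sentence_transformers",
                     "pandas", "seaborn"],
}
for script, modules in required.items():
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    print(f"{script}: {'all dependencies found' if not missing else 'missing: ' + ', '.join(missing)}")
```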
To perform the check, run
python3 1-run.py
Two files should be generated in the root folder. Kindly check scannedpages.csv and scannedurls.csv.
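For a quick programmatic look instead of opening the files manually, a sketch like the one below prints the size and header of both outputs; it makes no assumptions about the column names.

```python
# Minimal sketch: summarize the two CSVs produced by 1-run.py.
import pandas as pd

for path in ("scannedpages.csv", "scannedurls.csv"):
    df = pd.read_csv(path)
    print(f"{path}: {len(df)} rows, columns: {list(df.columns)}")
    print(df.head().to_string(), "\n")
```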
Optionally, to check how the malware is identified, run
python3 2-analyze.py
This should generate miniapp_all_result1.csv and miniapp_all_result2.csv, which respectively report whether any miniapps contain reporting interfaces similar to the ones in Figure 1 and Figure 7 of the paper. Bear in mind that the threshold is set for the full dataset, not the 5-miniapp dataset in this repo.
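To illustrate the idea behind the similarity-based identification, the sketch below mirrors the gist with sentence_transformers: embed the reconstructed page texts and a reference description of the official reporting interface, then flag pages whose cosine similarity exceeds a threshold. The model name, reference text, page texts, and threshold are illustrative placeholders, not the values used in the paper; the actual pipeline lives in main_similarity.py, testsimilarity.py, and readcsv.py.

```python
# Minimal sketch of similarity-based identification with SentenceBERT embeddings.
# Model name, reference text, page texts, and threshold are placeholders.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")            # placeholder model

official_reference = "Report this miniapp for violations"  # placeholder text
page_texts = {
    "pages/fake_report/index": "Report a violation: enter your phone number and ID",
    "pages/shop/index": "Add items to your cart and check out",
}

ref_emb = model.encode(official_reference, convert_to_tensor=True)
threshold = 0.60                                           # placeholder threshold

for page, text in page_texts.items():
    score = util.cos_sim(model.encode(text, convert_to_tensor=True), ref_emb).item()
    verdict = "similar to official reporting interface" if score >= threshold else "not similar"
    print(f"{page}: similarity={score:.3f} -> {verdict}")
```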
This project includes two modules. The core MIRAGE module that detects the MIRROR malware is located in myanalyzer, and the supporting module, a basic miniapp analysis framework derived from CMRFScanner and TaintMini, is located in pdg_js.
data and interm are used to store detection results and intermediate static-analysis data, including the analysis graphs.
ground_truth, artifact, and miniapps are used to store a partial subset of the dataset for evaluating the functionality of the tool.
Due to an inherited issue of the miniapp package unpacker, the wuWxapkg series of scripts has to be placed in the root folder, which makes it look a bit messy.
The entire topology is as follows.
MIRAGE # root folder
├─ artifact/
│ └─ appids.csv # index list for appids to analyze
│
├─ miniapps/ # miniapp packages
│
├─ interm/ # directory for MIRAGE interm data
│
├─ data/ # folder storing final results
│
├─ ground_truth/ # info for validating reproducibility
│
├─ metainfo/ # detailed info for the dataset
│
├─ myanalyzer/ # CORE MIRAGE FUNCTIONALITIES
│ ├─ cfg.py # data flow analysis module
│ ├─ wxmlparser.py # WXML adaptive module
│ └─ main_thread.py # main entrance script
│
├─ pdg_js/ # supporting Miniapp Analyzer utility
│
├─ 1-run.py # script for phase 1 validation
├─ 2-analyze.py # script for phase 2 validation
│
├─ main_similarity.py # SentenceBERT embedding generation
├─ testsimilarity.py # similarity score generation
├─ readcsv.py # csv processing & malware detection
│
├─ 5-url-assigner.py # multicore batch script (server)
├─ 5-url-worker.py # multicore batch script (client)
│
├─ decrypt.sh # package unpacker shell
├─ wuWxapkg.js # miniapp package unpacker
├─ restart.sh # directory initialization
├─ delete.sh # unpacked miniapp remover
We host the dataset on Google Drive. However, given the sensitive nature of these packages, and as requested by the affected platform during our responsible disclosure, the dataset shall not be released to malicious parties. Hence, a mandatory identity and motivation check is enforced. To access the full dataset, please follow the instructions at https://minimalware.github.io/.
Please note that the dataset is shared on an "in dubio pro reo" basis, i.e., we trust that users who access the dataset use it for research purposes and will delete it after the research is finished. Therefore, we require self-identification and motivation statements.
However, we retain the right to withdraw access to the dataset should ethical concerns arise or a petition from the original owners of the dataset be received.
We thank Dr. Yue Zhang for his time and advice contributed to this research. We also thank Chao Wang for his effort in maintaining the cloud infrastructure that is fundamental to this research. We thank Tencent and the WeChat Security Team for providing initial examples and valuable insights that initiated this research. Finally, we thank Dr. Chaoshun Zuo and Dr. Yue Zhang for helping us obtain the dataset of miniapp packages.
In addition, Dr. Yuqing Yang particularly thanks his parents, Mr. Yakun Yang and Ms. Jing Feng, for helping him evaluate the generality of the research by testing whether there are geographic differences in overseas versions of Super Apps.
If this work benefits your research, kindly cite it with the following BibTeX entry.
@inproceedings{mirror,
author = {Yuqing Yang and Zhiqiang Lin},
title = {Real or Rogue? Detecting Malicious Miniapps with Deceptive Reporting Interface},
booktitle = {Proceedings of the {ACM} on Web Conference 2026, {WWW} 2026, Dubai, United Arab Emirates, April 13-17, 2026},
publisher = {{ACM}},
year = {2026},
url = {https://doi.org/10.1145/3774904.3792470},
doi = {10.1145/3774904.3792470},
}
Additional inquiries can be directed to the first author's email. Kindly refer to https://frostwing98.com/.
