Data trace as the scientific foundation for trusted metrological data: a review for future metrology direction

PeerJ Computer Science

Introduction

Metrology, as a fundamental discipline that ensures the accuracy, reliability, and consistency of measurement results in scientific and industrial applications, serves as a vital support for the national strategic framework (Barbosa et al., 2022). With the continuous advancement of metrological science, digital transformation has become a key pathway to improving the efficiency and precision of measurement processes (Eichstädt, Keidel & Tesch, 2021; Toro & Lehmann, 2021; Rab et al., 2023). This transformation leverages modern technologies to enable the digital processing of measurement data, thereby enhancing the efficiency and accuracy of data acquisition, analysis, and management. In this context, scientific data in the field of metrology plays a crucial role in the development of foundational data standards, with its security primarily reflected in the trustworthiness of the data. Only reliable metrological scientific data can be used to establish standard reference datasets and effectively support the national measurement standards through secure and trustworthy data sharing (Xiong et al., 2021).

Against this background, the concept of trustworthy detection of metrological data has emerged. Although there is currently no unified definition of data trustworthiness in the field of metrology, by integrating research findings from metrology, information technology, and data science (Lazzari et al., 2017; Velychko & Gordiyenko, 2019; Santos, 2020), it can be broadly understood as the use of technical means to verify and ensure the integrity and trustworthiness of metrological data throughout its entire lifecycle. This process involves verifying data sources, recording data changes, and performing integrity checks during data transmission (Kuster, 2021). The core objective of trustworthy detection is to ensure that any abnormal intervention during data acquisition, transmission, or storage can be promptly identified, with the system providing a clear and traceable audit trail (Liu et al., 2024).
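To make the idea of integrity checking concrete, the following minimal sketch (Python) shows how a keyed hash can reveal any post-acquisition modification of a metrological data file. The file name `readings.csv` and the inline secret key are illustrative assumptions; this is a sketch of the principle, not a prescribed implementation.

```python
import hashlib
import hmac

SECRET_KEY = b"shared-secret-from-key-management"  # placeholder; obtain from a real key service

def fingerprint(path: str) -> str:
    """Compute a keyed SHA-256 digest of a data file at acquisition time."""
    with open(path, "rb") as f:
        return hmac.new(SECRET_KEY, f.read(), hashlib.sha256).hexdigest()

def verify(path: str, recorded_digest: str) -> bool:
    """Re-compute the digest later; any byte-level change breaks the match."""
    return hmac.compare_digest(fingerprint(path), recorded_digest)

# At acquisition: store the digest alongside the data.
digest = fingerprint("readings.csv")   # hypothetical measurement file
# At any later audit: a False result signals tampering or corruption.
print(verify("readings.csv", digest))
```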

It is worth noting that in the field of judicial forensics, a range of digital trace detection technologies has been proposed to meet data traceability requirements, and these technologies have gradually become key analytical tools in digital forensics (Horsman & Errickson, 2019). The core technologies aim to reveal traces of data generation, modification, and access behaviors, thereby providing a basis for assessing data trustworthiness and determining accountability. This concept aligns closely with the objectives of trustworthy detection in metrology. However, despite the logical and practical similarities between the two, the application of digital trace detection technologies in the field of metrology remains limited (Bianchi, Giannetto & Careri, 2018; Samoilenko & Tsiporenko, 2023).

Therefore, this article presents a systematic review of data trace detection technologies, summarizing the current state of research in this field. Building on existing studies of tamper detection, this article aims to provide valuable references for both data science researchers and metrology experts, while also identifying key research priorities and future development directions for trustworthy detection of metrological data. The main contributions of this article include:

  • Key challenges: The key challenges encountered in the practical application of metrology and the main obstacles to the trustworthy detection of metrological data are summarized.

  • Analysis of existing research: An overview of the current advancements in data trace technologies is provided, establishing a foundational understanding of data traces and their role in trustworthy data detection.

  • Detailed review and classification: An in-depth analysis is conducted on the formation and evolution of data traces at various stages of data storage. Relevant literature is categorized based on different stages of the storage process, presenting the implementation methods of tampering detection technologies at each stage to offer a comprehensive understanding of the application characteristics of data trace technologies.

  • Future directions: The potential applications of data trace technologies in the trustworthy detection of metrological data are explored, providing guidance for the development of more robust systems for data protection and tampering detection in the metrology domain.

“Key Challenges” presents the major challenges faced in trustworthy detection in metrology; “Related Conceptual Knowledge” introduces the conceptual knowledge related to data traces; “Methodology” reviews the various stages of the data storage process; “Analysis of Existing Research” analyzes the research and technological advancements in data traces at different stages; “Limitations and Future Directions” focuses on the limitations and application prospects of data trace technology in the metrology field; “Conclusion” provides a summary of the article.

Key challenges

Although digital measurement methods offer significant advantages in speed and accuracy over traditional methods, these advantages face serious challenges from the rapid growth in the volume and variety of metrological data and the emergence of deepfake-style forgery technologies (Hasan et al., 2022). It has become increasingly difficult to ensure that digital data remains unaltered and complete, which not only directly affects the accuracy of metrological data but may also lead to severe economic losses. For instance, common metering instruments such as fuel dispensers (Melo et al., 2021; Almeida, Oliveira & Melo, 2024) and electric meters (Jacob & Samuel, 2020) are frequent targets of illegal tampering.

Currently, the primary technical approaches for ensuring the trustworthiness of metrological data include digital signatures (Neyezhmakov, Zub & Pivnenko, 2021, 2022; Mustapää et al., 2022), hash function verification (Gregório, Miller & Duncan, 2024), and blockchain technology (Melo et al., 2021; Takegawa & Furuichi, 2023). These methods aim to guarantee the integrity and authenticity of key elements such as original file information, signature parameters, and timestamp data. In particular, applying these technologies to trusted timestamping has become a common approach in metrology for achieving data reliability and traceability (Matsakis, Levine & Lombardi, 2018; Melnik, Petrova & Vikulin, 2023).

Metrological scientific data typically contain a timestamp field that represents the time of the measurement activity during which the data were generated. This field serves both as metadata and as a trusted element for data traceability. Its reliability can be ensured through a well-structured and continuous time traceability chain. Trusted timestamping technologies are generally implemented based on high-precision time synchronization mechanisms and cryptographic signature algorithms, emphasizing the measurability and traceability of time information (Kumar et al., 2022; Wendt, 2025). These technologies ensure that the recorded time is anchored to a clear physical time reference and carries legal validity. Current research is advancing the integration of high-precision time sources into timestamping systems to further enhance their accuracy and verifiability.
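As a hedged illustration of this mechanism, the sketch below binds a measurement value to its timestamp with an Ed25519 signature using the widely available `cryptography` package; the record fields are illustrative assumptions. A verifier holding the public key can confirm that neither the value nor the recorded time was altered after signing.

```python
import json
import time
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()   # in practice: held by the timestamping authority
public_key = private_key.public_key()        # distributed to verifiers

# Bind the measurement to a time value taken from a trusted, synchronized source.
record = json.dumps({"value": 230.04, "unit": "V", "utc_ns": time.time_ns()},
                    sort_keys=True).encode()
signature = private_key.sign(record)

# Verification raises cryptography.exceptions.InvalidSignature if the
# record or the signature was modified after signing.
public_key.verify(signature, record)
print("timestamped record verified")
```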

In addition, some studies are exploring the integration of timestamping mechanisms with blockchain technology (Milicevic et al., 2022), leveraging distributed ledger structures to record the generation and verification processes of timestamps. This approach further enhances anti-tampering capabilities and operational transparency, providing critical support for the development of a trustworthy and highly reliable metrological data assurance system.

Although the aforementioned technologies have shown initial success in tamper prevention and data traceability during the data acquisition stage, a systematic, end-to-end trustworthy assurance framework for metrological data has yet to be established. In particular, there is still a lack of effective technical mechanisms to support post-tampering behavior detection, tampering path analysis, and accountability tracing. This shortcoming has become a critical obstacle hindering the further development of a trusted metrology system.

Related conceptual knowledge

Digital forensics

Digital forensics (DF), originating from forensic computer science, aims to encompass all digital devices that can be investigated. The National Institute of Standards and Technology (NIST) defines DF as “the scientific application of data identification, collection, examination, and analysis, while ensuring the integrity of information and maintaining a strict chain of custody” (Lyle et al., 2022). Due to its multidisciplinary and interdisciplinary nature, DF can be widely extended to other fields. For example, devices used in metrology are also potential sources of digital evidence (Irons, 2010).

In modern digital forensic investigations (see Fig. 1), the core component for ensuring data security is the "Ensure and Detect" phase. The main function of this phase is to help determine whether unauthorized operations have occurred and to ensure data security through relevant technologies and tools. In this process, the key to identifying and tracking data anomalies is the data trace.


Figure 1: Steps of digital forensics investigation.

Data trace

The concept of data traces originates from the interpretation of physical tool marks in forensic science. Similar to the marks left by physical tools on the surface of objects (Kaur, Sharma & Sharma, 2024), data also leaves distinct traces during storage and processing. As Kessler (2007) demonstrates, “most processes leave identifiable erasure traces,” indicating that when unauthorized users attempt to manipulate or tamper with data through computer system operations, these actions are likely to leave detectable traces.

Data traces typically encompass various types of operation-related information, such as operating system usage files, specific tool footprints, system-generated logs, and file system metadata. This information plays a critical role in recording and reflecting operational activities. Even when certain tools attempt to overwrite or conceal traces of operations, residual logs, caches, or hidden metadata within the system may still retain partial data traces, thereby providing key evidence for digital trust verification and provenance analysis. These traces not only reveal the time, location, and nature of operations but also potentially uncover details of tampered or deleted data, thereby supporting the detection of data trustworthiness and integrity breaches. Furthermore, they assist digital investigators in tracking the evolution of data, reconstructing timelines of data operations, and analyzing the causes of data loss, thus offering significant value.

Anti-forensics

Anti-forensics (AF) is a frequently mentioned term in the field of DF, referring to various techniques aimed at weakening, concealing, tampering with, or deleting digital evidence (González Arias et al., 2024). The primary objective of these techniques is to disrupt, obscure, or fabricate data traces, thereby hindering the process and accuracy of DF investigations. Although no formal and unified definition currently exists, most practitioners and scholars generally agree that any attempt to alter, interfere with, negate, or obstruct scientifically valid forensic investigations through technical means can be classified as AF techniques (Mohammad et al., 2024).

For example, timestamp forgery is a common AF technique that involves modifying a file’s creation, modification, or access times in order to confuse investigators and complicate or invalidate timeline analysis (Oh, Lee & Hwang, 2024). Common tools for timestamp forgery are shown in Table 1. In practice, the impact of AF techniques extends beyond the field of DF. By compromising data trustworthiness and traceability, these techniques pose a threat to any field that relies on high data credibility.

Table 1:
File system timestamp tampering tools.

Forgery methods | Forgery tools
API function (second-level precision) | Dat FileTouch, NewFileTime, SKTimeStamp, BulkFileChanger, Change Timestamp, and eXpress TimeStamp Toucher
API function (nanosecond-level precision) | Timestamp, nTimestamp
Disk access | SetMace*

Note: *Starting from version 1.0.0.6, SetMace uses direct disk access.
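Because the API-based tools in Table 1 mostly operate at one-second granularity while NTFS stores timestamps at 100 ns resolution, timestamps whose fractional-second part is exactly zero are a classic red flag. The sketch below (Python) applies this as a simplified heuristic, not a conclusive test, since legitimate operations such as copies from FAT volumes also truncate precision.

```python
import os

def zeroed_subseconds(path: str) -> bool:
    """Heuristic: True if all three stat timestamps fall on whole seconds.
    On NTFS this is unusual for organically created files and may indicate
    second-precision timestamp forgery; it is NOT proof on its own."""
    st = os.stat(path)
    return all(ts % 1_000_000_000 == 0
               for ts in (st.st_mtime_ns, st.st_atime_ns, st.st_ctime_ns))

for name in os.listdir("."):
    if os.path.isfile(name) and zeroed_subseconds(name):
        print(f"review manually: {name}")
```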

In practical applications, AF takes diverse technical forms, is not limited to tools or methods specifically designed for this purpose, and is becoming increasingly complex. In fact, certain techniques and actions not originally intended for AF can inadvertently affect data integrity and trustworthiness (Palmbach & Breitinger, 2020; Yaacoub et al., 2021), especially in areas related to privacy protection and system optimization. For example, disk defragmentation and automatic cleanup tools are designed to improve system performance or protect user privacy, yet they may delete or overwrite data valuable to digital investigations. Although these tools lack explicit AF intent, they can still hinder digital forensic efforts.

The key factor in distinguishing between legitimate actions and malicious AF activities often lies in the user’s subjective intent. Many privacy-enhancing technologies, such as encryption, steganography, and anonymous communication, play a significant role in protecting user privacy. However, when maliciously exploited, these technologies can severely hinder the verification of data trustworthiness, thereby further complicating the field of data security.

Data storage process

After years of development, DF has been widely applied in various fields. In the context of metrology, to effectively utilize DF, it is essential to focus on the aspects most pertinent to metrologists—the data transmission and storage processes in metrological devices. This focus serves as the basis for the classification in our current study.

Currently, to ensure data security, the vast majority of metrological data storage still relies on physical storage. The general process of data storage is illustrated in Fig. 2: users interact with applications through the operating system’s user interface, and the input data is then transmitted from the application to the operating system. Within the operating system, data is managed by the file system and ultimately passed to the storage system. The operating system, in conjunction with the host controller and storage controller, writes data to the physical storage.


Figure 2: The data storage process.

During this process, the data undergoes multiple transformations, transitioning from a high-level, user-understandable format to a binary format suitable for physical storage, as shown in Fig. 3. Because computer data is inherently mutable, almost any operation can alter the associated metadata. Furthermore, each stage of the data processing flow exhibits distinct characteristics and operational methods, so data trace analysis has a different focus at each stage.


Figure 3: The transformation of data forms.

Methodology

To comprehensively organize and summarize the research efforts in the field of data traceability, this article outlines the following research plan:

  • Literature review by stages: The types of data traces generated at different stages of the data storage process vary, as shown in Table 2. Therefore, to better understand the different data trace detection techniques at each stage, this study analyzes relevant research and technologies in stages based on the data storage process, providing a detailed discussion of the key issues and research advancements related to data tracking at each stage.

  • Comprehensive literature search: A comprehensive literature review was conducted on research related to data trace detection in the field of digital forensics, encompassing several authoritative databases, including SpringerLink, ScienceDirect, IEEE Xplore, Google Scholar, and Web of Science, to ensure a broad and diverse range of resources. In addition, we manually examined the reference lists of the selected studies and analyzed their citation records to identify potential relevant research.

  • Rigorous selection criteria: To ensure the quality and rigor of this review, the retrieved studies were subjected to detailed screening and quality evaluation. Studies unrelated to the topic of data traceability and those published in low-quality journals were excluded. Priority was given to research published in high-impact journals and top-tier conferences to guarantee the authority and academic value of the review.

  • Information extraction from selected studies: Key information, including authorship, publication year, research type, and major contributions, was extracted and organized into tables or visual formats to facilitate comparisons and analyses for scholars in the metrology field engaging in interdisciplinary studies.

Table 2:
Data traces across different stages.

Stage | Data traces
Application layer | Multimedia features, application logs, configuration files
Operating system | OS logs, process traces, network activity, system calls, error reports
File system | File allocation table, metadata (timestamps, permissions, owner, size), file paths
Storage device | Disk fragments, memory caches, virtual memory, deleted files

Based on the aforementioned methodology, this study conducted a literature search using keywords related to digital evidence, anti-forensics, and digital forensics, in combination with terms related to “data traces” (such as detection, traces, and tampering). A total of 1,937 articles published in the field of data forensics over the past five years were identified, among which 202 articles contained terms related to data traces. Subsequently, these 202 articles were screened, eliminating irrelevant and duplicate articles, and citation records were further analyzed to identify additional relevant studies. To ensure comprehensiveness, articles that were highly relevant but not captured through automated searches were manually included. In the end, 81 articles were selected as the analysis objects for this study, and the keyword frequency distribution of the selected articles is shown in Fig. 4.


Figure 4: Word cloud of keywords from related literature.

Figure 5 illustrates the year distribution of the selected articles, indicating that most of the publications were published within the last five years. Based on the content of the research, this study categorizes the literature into four categories: application layer, operating system, file system, and storage devices. Figure 6 presents the distribution of the number of articles in these four categories, showing that the output of articles across these categories is relatively balanced, with each category having a substantial number of articles published in the past five years.


Figure 5: Year distribution of literature on data traces.


Figure 6: Distribution of literature across research categories.

Analysis of existing research

Application layer trace detection

Prior knowledge

The application layer is an advanced stage of data processing, primarily responsible for high-level data management and operations. At this stage, the data has typically undergone initial encoding, decoding, and transformation, making it meaningful and usable. Multimedia data, such as images, audio, and video, are common data types at this stage. For example, in the field of metrology, digital calibration certificates (Neyezhmakov, Zub & Pivnenko, 2021) and industrial measurement inspection images (Lins, Santos & Gaspar, 2023) are typical examples of such data.

Existing multimedia data tampering operations typically include deleting, hiding, or replacing specific regions, adding new elements, and adjusting the size or position of existing elements. Although these tampering operations may leave no visible or audible traces through anti-forensic techniques, the data’s underlying structure, encoding methods, and compression characteristics often still reflect traces of modification. Through specific detection techniques, these traces can generally be identified and restored, revealing whether the data has been tampered with or forged (Farid, 2016).

Related research

Currently, the majority of research focuses on multimedia data tampering detection, covering areas such as text, images, audio, and video. Additionally, some studies are dedicated to the detection of anti-forensic attacks. The literature related to application-layer data traces is summarized in Table 3.

Table 3:
Trace monitoring related literature at the application layer.

Related research | Research content | Contribution
Bourouis et al. (2020) | Multimedia | Reviewed the latest advancements in digital multimedia tampering detection and proposed an intuitive soft classification method to enhance the reliability of digital multimedia authentication.
Wang et al. (2022) | Multimedia | Reviewed current media tampering detection methods, including image tampering detection and deepfake detection, and discussed the challenges and future research trends.
Zhang, Wang & Zhou (2018) | Video | Summarized the current research progress in video forgery detection technologies, covering various active and passive detection methods, and suggested future research directions.
Shelke & Kasana (2021) | Video | Conducted a comprehensive survey on passive techniques for digital video forgery detection, analyzing existing methods, discussing anti-forensic strategies and deepfake video detection, and proposing a standardized benchmark dataset and a general framework for digital video forgery detection.
Akhtar et al. (2022) | Video | Comprehensively reviewed existing passive video tampering detection techniques, analyzed the strengths and weaknesses of current research, and discussed the limitations of video forensics algorithms along with future research directions.
Li & Huo (2024) | Video | Proposed a video frame deletion detection method based on continuity-attenuation features, capturing spatiotemporal characteristics of forged videos through a deep learning network architecture. Experimental results show a detection rate of 93.85%.
Shaikh & Kannaiah (2025) | Video | Proposed a hybrid approach combining traditional block-based analysis and convolutional neural networks (CNNs) for video forgery detection, effectively addressing advanced forgery techniques. The method achieved an accuracy of 79.31% and an F1-score of 65.87, outperforming existing methods.
da Costa et al. (2020) | Image | Reviewed the current state of research in image tampering and anomaly detection, identified future opportunities and challenges, and provided an innovative contribution through in-depth analysis and comprehensive evaluation of over 100 related studies.
Padilha et al. (2022) | Image | Proposed a content-aware method for detecting timestamp manipulation by verifying the consistency between image content, capture time, and geographic location. The approach significantly improved classification accuracy on a large benchmark dataset and can estimate capture time when the timestamp is missing.
Bharathiraja et al. (2024) | Image | Conducted a comprehensive survey on image forgery techniques and detection methods, discussing source camera identification, type-dependent and type-independent methods, and deep learning approaches, aiming to enhance understanding of image forgery detection and demonstrate cross-domain applications.
Kaur, Jindal & Singh (2024) | Image | Reviewed passive image forgery detection techniques, discussing common forgeries such as splicing, copy-move, and retouching, along with frameworks, taxonomy, and methodologies, to assist researchers in overcoming challenges and to offer future directions.
Capasso, Cattaneo & De Marsico (2024) | Image | Surveyed image integrity detection methods, discussing common forgery attacks and countering techniques and evaluating the limitations of existing methods in detecting counterfeit images. Highlighted the potential of combining approaches, particularly convolutional neural networks, for forgery detection.
Johnson & Davies (2020) | Text | Used digital forensic techniques to analyze a contract cheating case, revealing inconsistent editing patterns through revision identifiers in Microsoft Word documents. Developed a tool using visualization techniques to identify fraudulent editing traces, aiding in the detection of contract cheating.
Johnson, Davies & Reddy (2022) | Text | Explored the application of digital forensics techniques in higher education for detecting academic misconduct, proposing the repurposing of tools used in criminal cases for file tampering and metadata extraction to detect plagiarism and contract cheating. Developed a prototype software tool for automated plagiarism/contract-cheating detection.
Joun, Lee & Park (2023a) | Text | Proposed a novel framework for precise tracing of deleted or overwritten data through data remnants analysis, overcoming the limitations of traditional digital forensics methods. A case study on Microsoft 365 demonstrated the superior efficacy and accuracy of the approach.
Spennemann, Spennemann & Singh (2024) | Text | Examined how RSID numbers in Microsoft Word can be used for detecting academic misconduct, explored how common editing actions such as copy, paste, and text insertion leave traces in the rsid encoding, and provided prompts to guide investigators in interviewing alleged perpetrators.
Leonzio et al. (2023) | Audio | Proposed a method for detecting and localizing audio splicing based on CNNs, using model-specific features to identify whether an audio track has been spliced. Manipulations are detected through clustering, and splicing points are located using a distance-measuring technique.
Son & Park (2024) | Audio | Proposed a deep learning-based approach for detecting forged audio files, specifically targeting modifications made using the "Mixed Paste" command. The method combines a convolutional neural network and a long short-term memory (LSTM) network, utilizing Korean consonant features from spectrograms, and achieves a high accuracy rate of 97.5%.
Ustubioglu et al. (2025) | Audio | Proposed a novel audio copy-move forgery detection method that identifies forged segments by extracting high-resolution spectrograms and matching keypoints, combined with graph-based representations and CNN classification, achieving superior detection accuracy. The method performs well across various attack scenarios, demonstrating strong generalization potential.
Rekhis & Boudriga (2011) | AF | Proposed a theoretical approach to digital forensic investigations that considers anti-forensic attacks, addressing the challenge of attackers obstructing event analysis and attack scenario reconstruction through such attacks.
Rekhis & Boudriga (2012) | AF | Proposed a novel hierarchical visibility theory for digital investigations to counter anti-forensic attacks, enabling the handling of security events in complex systems and improving the provability of attacks.

In summary, current application-level data trace detection methods can be classified into two main categories: active detection and passive detection, as shown in Fig. 7. Active detection methods rely on auxiliary information, such as digital watermarks, digital signatures, and revision save identifiers (RSID), to ensure the authenticity and integrity of the data. These methods embed additional markers or information into the data, enabling timely detection in the event of tampering. However, a limitation of active detection methods is that they require pre-embedded anti-tampering mechanisms, and in some cases, an attacker may be able to remove or modify these identifying markers. In recent years, blockchain technology has received increasing attention in the field of active detection. Its decentralized and immutable features effectively enhance data security (Sakshi, Malik & Sharma, 2024; Ramezanzadehmoghadam, 2024); however, since this study focuses on detecting traces of tampered data, blockchain technology is not the primary focus of this article.

Figure 7: Classification of multimedia tampering detection.

In contrast, passive detection methods do not rely on any pre-embedded information, but rather analyze the inherent characteristics of the data itself. These methods detect tampering by exploiting inevitable inconsistencies during the tampering process, such as pixel-level anomalies, geometric distortions, physical lighting inconsistencies, and compression artifacts. Since passive detection methods do not require special processing during data generation, they offer greater flexibility and are less susceptible to tampering by attackers. However, the detection accuracy of these methods typically depends on the quality of the data. When facing issues such as blurred data edges, compression, or noise, the performance of the technology may degrade, leading to a higher false positive rate.
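As one concrete example of exploiting compression artifacts, the sketch below performs a basic error level analysis (ELA) on a JPEG using Pillow: the image is re-saved at a known quality and subtracted from the original, so regions pasted in from a source with a different compression history stand out with a different error level. The file name and quality setting are illustrative assumptions, and ELA is a screening aid rather than a conclusive test.

```python
from PIL import Image, ImageChops

def error_level_analysis(path: str, quality: int = 90) -> Image.Image:
    """Re-save a JPEG and amplify the per-pixel difference; tampered
    regions often show an error level inconsistent with their surroundings."""
    original = Image.open(path).convert("RGB")
    original.save("_resaved.jpg", "JPEG", quality=quality)
    resaved = Image.open("_resaved.jpg")
    diff = ImageChops.difference(original, resaved)
    max_diff = max(hi for _, hi in diff.getextrema()) or 1  # avoid division by zero
    return diff.point(lambda p: min(255, int(p * 255 / max_diff)))

error_level_analysis("inspection_photo.jpg").save("ela_map.png")  # hypothetical input
```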

Researchers are actively exploring the use of emerging technologies, such as deep learning (Padilha et al., 2022; Leonzio et al., 2023; Shaikh & Kannaiah, 2025), to enhance detection capabilities. Although deep learning methods have made significant progress in certain applications, the existing methods typically use different evaluation metrics and datasets, making comparisons between them difficult. Additionally, while improving detection accuracy helps identify more tampered behavior, it often increases computation time and resource consumption, which presents a significant challenge in current research. Therefore, balancing detection accuracy, time efficiency, and resource consumption remains a critical issue that future application-level research must address.

Operating system trace detection

Prior knowledge

Operating systems generate and manage various metadata during data transmission, typically including file operation records, user activities, and system states. However, different operating systems exhibit significant differences in data processing, recording methods, and log management strategies. The major operating systems, as shown in Fig. 8, can generally be divided into two branches: one based on Microsoft Windows and the other derived from Unix. The Windows NT series features a complex event logging system that comprehensively records file operations, user logins, and other activities. In contrast, Unix/Linux systems typically rely on syslog to log system behaviors, resulting in a narrower scope of logs compared to Windows systems. macOS combines a Unix-style logging system with its own activity monitoring tools, enabling the capture of a broad range of system and user activity information. Therefore, understanding the data processing methods and log recording mechanisms of different operating systems is crucial for effective digital trace detection.

Figure 8: Classification of operating systems.
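As a small illustration of log-based trace analysis on Unix-like systems, the sketch below parses classic syslog lines and flags entries whose timestamps run backwards, one simple indicator of clock manipulation or log editing. The regex targets the traditional `MMM dd HH:MM:SS` syslog format and is an assumption; modern systems often use journald or RFC 5424 timestamps instead.

```python
import re
from datetime import datetime

SYSLOG_TS = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2})")

def non_monotonic_entries(log_path: str, year: int = 2024):
    """Yield line numbers whose timestamp precedes the previous entry."""
    last = None
    with open(log_path, errors="replace") as f:
        for lineno, line in enumerate(f, start=1):
            m = SYSLOG_TS.match(line)
            if not m:
                continue
            ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            if last is not None and ts < last:
                yield lineno
            last = ts

for n in non_monotonic_entries("/var/log/syslog"):
    print(f"timestamp goes backwards at line {n}")
```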

Related research

In recent years, researchers have conducted extensive studies on trace detection at the operating system level, covering most mainstream operating systems. Table 4 summarizes the literature related to operating system data traces.

Table 4:
Trace monitoring related literature at the operating system layer.

Related research | Research content | Contribution
Lees (2013) | Windows | Explored methods for detecting the deletion of digital evidence using Update Sequence Number (USN) log files, identifying recognizable patterns left by specific software usage and 'private browsing' modes.
Singh & Gupta (2019) | Windows | Studied file timestamp patterns generated when using different operating system tools to manipulate files in the Windows Subsystem for Linux environment on Windows 10, aiming to detect timestamp forgery.
Studiawan, Sohel & Payne (2019) | Windows | Provided a comprehensive overview of forensic analysis techniques, tools, and datasets for operating system logs, and proposed potential directions for future research.
Dija et al. (2020) | Windows | Proposed a detailed process for reconstructing or retrieving key information related to suspicious programs from multiple sources in Windows systems, even if the programs have been deleted or overwritten.
Joo, Lee & Jeong (2023) | Windows | Explored the impact of file-wiping tools on different Windows artifacts, analyzed traces left by 10 file-wiping tools in 13 Windows artifacts, and created a reference database to assist forensic investigators in identifying the use of file-wiping tools.
Vanini et al. (2024) | Windows | Introduced the concept of time anchors to validate and correct system clock skew, addressing timestamp inaccuracies caused by system clock deviations, and demonstrated the use and benefits of time anchors in digital forensic event reconstruction through specific case examples.
Todd & Peterson (2024) | Windows | Introduced the digital trace inspector (DTI), a learning classifier system (LCS)-based tool for temporal metadata analysis. The tool groups temporal digital traces of targeted user activity using a binary Michigan-style LCS and constructs rules through an expert knowledge framework. Experimental results show perfect recall across 10 user behavior scenarios, with an average F1-score of 0.98 and minimal training data required.
Garland et al. (2024) | Windows, Linux | Investigated digital forensic artifacts related to 3D printing slicing software, analyzing data stored in Windows and Linux operating systems. By collecting RAM, configuration data, generated files, residual data, and network data, vital evidence was identified for 3D printing forensics.
Fiadufe et al. (2025) | Windows | Investigated malware forensics in the Windows 7 OS, using memory and disk forensics tools to analyze malicious software through static and dynamic analysis, and proposed a comprehensive malware forensics framework to enhance malware detection and analysis.
Thierry & Müller (2022) | Unix | Proposed a comprehensive approach to deeply analyze MACB timestamps in Unix-like systems, revealed the influence of different layers of the software stack, and introduced a framework for automatic analysis of operating system kernels, user-space libraries, and applications, while identifying and documenting various unexpected and non-POSIX-compliant behaviors.
Govindaraj et al. (2015) | iOS | Introduced the iSecureRing solution, which provides security for mobile applications on jailbroken iPhones and preserves event timestamps to support forensic analysis.
Joun, Lee & Park (2023b) | macOS | Proposed a method to track deleted files by analyzing various sources of file-related metadata in macOS systems, addressing the issue of spoliation of evidence, particularly in cases of information leakage, digital sexual crimes, accounting fraud, and copyright infringement.
Pieterse, Olivier & Van Heerden (2015) | Android | Proposed a new method, the Authenticity Framework for Android Timestamps (AFAT), to verify the authenticity of SQLite database timestamps in Android smartphones.
Pieterse, Olivier & Van Heerden (2016) | Android | Proposed a reference architecture for detecting tampering evidence in Android applications, focusing on identifying evidence manipulation through the analysis of timestamps in SQLite databases.
Pieterse, Olivier & Van Heerden (2017) | Android, iOS | Proposed a novel reference architecture for smartphone applications and derived seven theories of normal behavior from this architecture to evaluate the authenticity of smartphone evidence.
Pieterse, Olivier & Van Heerden (2019) | Android, iOS | Investigated the impact of manipulating smartphone data on Android and iOS platforms and proposed an evaluation framework for detecting tampered data, successfully demonstrating data manipulation behaviors and providing preliminary evidence that the framework aids in detecting tampered smartphone data.
Han & Lee (2022) | Android | Studied the handling of timestamps for multimedia files on mobile devices, examining how operating systems and file systems manage various types of timestamps and the potential impact of timestamp reversal on digital forensics.
Park et al. (2022) | Android | Proposed a forgery analysis method for Samsung smartphone call recordings, identifying tampering by analyzing audio bandwidth, latency, file structure, media-log, and call history to verify the authenticity of the recordings.
Rathore et al. (2023) | Android | Investigated adversarial attacks and defenses in Android malware detection, introducing two reinforcement learning-based evasion attacks, analyzing the vulnerabilities of 36 detection models, and proposing the MalVPatch defense strategy to enhance detection accuracy and adversarial robustness.
Valeti & Rathore (2024) | Android | Introduced a malware detection model based on Android permissions and intents, presented a novel adversarial attack method (GBKPA), analyzed the transferability of adversarial samples, and proposed the AuxShield defense mechanism, significantly enhancing the adversarial robustness of detection models.

In summary, trace detection at the operating system level spans multiple mainstream operating systems, with a primary focus on tampering detection in system logs and file metadata, particularly core issues such as data deletion, timestamp forgery, and malware forensics. For instance, the temporal metadata analysis method proposed by Todd & Peterson (2024) demonstrated exceptionally high recall and accurate recognition of user behavior scenarios while requiring minimal training data, which is crucial for forensic efficiency and accuracy. Additionally, the timestamp validation frameworks introduced by Pieterse, Olivier & Van Heerden (2015, 2016, 2017, 2019) for Android and iOS platforms greatly improved the detection of data tampering on mobile devices, showing strong adaptability and accuracy when dealing with timestamp forgery in databases.

However, despite these advancements, there remains a notable bias in the research on operating system-level data trace detection. Most of the research focuses on Windows and Android systems, while studies on Unix-like systems and iOS (particularly due to the constraints of closed ecosystems) are relatively scarce. Although some scholars (Garland et al., 2024) have made positive strides in cross-platform forensics, the challenges posed by the forensic limitations of closed ecosystems and the fragmentation of open systems still result in insufficient generalization and cross-platform compatibility of current technologies. Therefore, future research should place greater emphasis on improving the general applicability of existing technologies, particularly in both open and closed ecosystem environments, to address compatibility issues across different operating systems.

File system trace detection

Prior knowledge

The file system records various metadata associated with files, documenting the entire history of file operations, from creation and modification to deletion, and even enabling the traceability of a file’s origin and content changes. As such, it is a crucial source of data traces. Currently, there is a wide variety of file systems used across different fields, and for the same operating system, multiple file systems are typically available for selection, as detailed in Table 5. Different types of file systems generate varying amounts and types of metadata and utilize their own unique mechanisms to record and manage this metadata. In short, the complexity of the file system’s design directly affects the richness and accuracy of its metadata.

Table 5:
Common operating systems and supported file systems.

Operating system | File systems
Windows | FAT12, FAT16, FAT32, exFAT, NTFS, ReFS
Mac | APFS, HFS, HFS+, FAT32, exFAT, UDF
Linux | Ext, Ext2, Ext3, Ext4, Btrfs, XFS, JFS, GFS, VFAT, HPFS, Swap
FreeBSD | UFS, ZFS, FAT32, ext2, ext3, ext4
Android | FAT32, Ext3, Ext4, exFAT, F2FS
iOS | APFS, HFS+, FAT32
Optical media | ISO 9660, UDF (Universal Disk Format)
Embedded systems | FAT12, FAT16, FAT32, ext2, ext3, JFFS2

For example, the NTFS file system (Fig. 9), compared to the FAT file system (Fig. 10), records more detailed operational information and logs, thus generating richer metadata. The potential data traces produced by these file systems provide critical support for trace detection technologies, enabling researchers to more easily track the full lifecycle of data and effectively identify traces of file tampering and their origins (Buchholz & Spafford, 2004).

Figure 9: NTFS file system architecture.
Figure 10: FAT file system architecture.
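The metadata discussed above can be queried directly; the following hedged sketch applies simple temporal consistency rules to a file's stat record. Note that the meaning of `st_ctime` differs by platform (creation time on Windows, inode-change time on Linux), so such rules must be adapted to the file system actually in use; the file name is a hypothetical example.

```python
import os
import sys

def metadata_red_flags(path: str) -> list[str]:
    """Simple rule-based screening of file system timestamps."""
    st = os.stat(path)
    flags = []
    # On Windows, st_ctime is the creation time, so a file "modified before
    # it was created" is a classic inconsistency worth investigating.
    if sys.platform == "win32" and st.st_mtime < st.st_ctime:
        flags.append("modification time precedes creation time")
    # A zero-valued timestamp suggests direct manipulation or wiping.
    if st.st_mtime_ns == 0:
        flags.append("modification time is the epoch (likely wiped)")
    return flags

print(metadata_red_flags("calibration_report.pdf"))  # hypothetical file
```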

Related research

Table 6 summarizes the key research findings related to file system trace detection.

Table 6:
Trace monitoring related literature at the file system layer.

Related research | Research content | Contribution
Chow et al. (2007) | NTFS | Focused on temporal analysis of the NTFS file system and proposed intuitive rules for digital file behavior characteristics to aid event reconstruction in computer forensics.
Neuner et al. (2016) | NTFS | Proposed and studied the feasibility of using file timestamps as a steganographic channel, leveraging the information gap between storage and use of timestamps with high-precision timers in modern operating systems, and designed a layered steganographic system with secrecy, robustness, and wide applicability.
Neuner et al. (2017) | NTFS | Evaluated the steganographic potential of redundant capacity in file system timestamps for information hiding and data exfiltration, and proposed technical methods to assist digital forensic investigations in identifying and detecting tampered file system timestamps.
Nordvik, Toolan & Axelsson (2019) | NTFS | Proposed a new method leveraging the $ObjId Index in the NTFS file system to record and correlate user activities, providing a novel tool for storage device forensics.
Bahjat & Jones (2019) | NTFS | Proposed a framework for determining the time window of deleted file fragments by analyzing the timestamps of allocated neighbors, innovatively leveraging the known temporal states of adjacent clusters to infer the creation and deletion times of file fragments.
Mohamed & Khalid (2019) | NTFS | Proposed a machine learning-based method for the automatic detection of NTFS file system timestamp tampering, improving detection efficiency and accuracy.
Palmbach & Breitinger (2020) | NTFS | Proposed the use of four existing Windows system log files to detect timestamp tampering in the NTFS file system and analyzed the reliability of this evidence.
Oh, Park & Kim (2020) | NTFS | Proposed an anti-anti-forensics method based on NTFS transaction characteristics and machine learning algorithms, capable of detecting data wiping traces and effectively identifying wiped files and the tools used, providing evidence for digital forensic investigations.
Kao (2020) | NTFS | Explored timestamp transfer and contact artifacts in data hiding operations, proposing a forensic exchange analysis method based on temporal attributes, with practical applications in crime scene reconstruction. Experimental results demonstrate the method's effectiveness in uncovering hidden relationships and tracing forensic evidence.
Galhuber & Luh (2021) | NTFS | Investigated file timestamp patterns generated by user operations in the NTFS file system to address timeline forgery techniques commonly used in anti-forensics, and evaluated the effectiveness of seven third-party timestamp forgery tools along with the forensic evidence they produce.
Horsman (2021) | NTFS, FAT32 | Studied eight free file wiping tools and identified the "Digital Tool Marks" (DTMs) left behind after using these tools, exploring whether these traces can be linked to specific wiping tools to enhance digital forensic analysis.
Bouma et al. (2023) | NTFS | Proposed a method for reconstructing file history based on NTFS timestamps, innovatively inferring all possible file operation sequences and generating a tree-structured timeline of file history for digital forensics.
Song & Lee (2023) | NTFS | Designed a timestamp manipulation detection method based on storage performance in NTFS, which verifies file write time and compares it with storage performance to detect timestamp tampering. This method overcomes the limitations of traditional detection techniques, preventing precise timestamp manipulation.
Oh, Lee & Hwang (2024) | NTFS | Compared and analyzed existing NTFS timestamp manipulation detection methods, proposing a new detection algorithm based on NTFS journals to overcome the limitations of previous methods. Experimental results show significant improvements in detecting timestamp manipulations and identifying normal events.
Lee & Song (2024) | NTFS | Proposed a method for detecting abnormal Windows file paths based on natural language processing, converting file paths into vectors and classifying them using machine learning. Experimental results demonstrated an accuracy of up to 94% in detecting abnormal file paths.
Nordvik et al. (2021) | Comprehensive | Proposed a formal reliability validation procedure for file system reverse engineering, focusing on documenting the forensic process and tools to ensure reliability and reproducibility, and highlighted the inadequacy of current validation methods in meeting legal and scientific requirements.
Heeger, Yannikos & Steinebach (2021) | exFAT | Proposed two methods for hiding data within the exFAT file system, with exHide offering higher robustness and a reasonable embedding rate. The study evaluates both methods and discusses their strengths and weaknesses.
Nordvik & Axelsson (2022) | exFAT | Investigated the handling of timestamps in the exFAT file system, revealing significant inconsistencies between Windows, macOS, and Linux implementations, as well as discrepancies in forensic tools' interpretation, and concluded that timestamp interpretation in criminal investigations should not assume flawless adherence to file system specifications or accurate interpretation by forensic tools.
Lee et al. (2023) | ReFS | Proposed a method for detecting data wiping tools based on deletion patterns in the ReFS $Logfile, identifying specific deletion behaviors through opcode variations, and demonstrated the application and effectiveness of this method in digital forensics.
Toolan & Humphries (2025) | XFS | Examined five methods of data hiding in the XFS file system, evaluating them based on capacity, detection difficulty, stability, and other metrics, and highlighted XFS as a viable option for data hiding.

In the field of file system-level trace detection, timestamp analysis occupies a central position. Current research has gradually evolved from traditional temporal rule modeling to multi-dimensional detection techniques. However, experiments by Galhuber & Luh (2021) on seven types of third-party tampering tools indicate that existing detection methods still face the challenge of increasingly sophisticated falsification techniques. For example, attackers may exploit redundant fields in the lower layers of the file system to bypass traditional validation mechanisms, thereby evading detection.

Moreover, differences in file systems pose significant challenges to forensic reliability. Studies have shown that the characteristics of different file systems directly affect the effectiveness of detection. For example, the ambiguity in timestamp interpretation of exFAT across Windows, Linux, and macOS may lead to conflicting forensic conclusions across platforms (Nordvik & Axelsson, 2022).

Overall, current research faces three major bottlenecks: first, an overemphasis on NTFS, which results in the delayed development of detection methodologies for other file systems; second, detection algorithms are predominantly based on laboratory environments, lacking robustness validation in complex real-world scenarios; and third, there is a lack of evaluation of the trustworthiness of forensic tools themselves.

Storage device trace detection

Prior knowledge

Storage devices can be classified into two main categories: persistent storage and volatile storage (see Fig. 11). Persistent storage includes hard disk drives (HDDs), solid-state drives (SSDs), and flash memory, all of which retain data on physical or electronic media for long periods even after power loss. For example, an HDD relies on magnetic platters to store data, and data can sometimes be recovered even after the device has been formatted. In contrast, an SSD uses different storage and erasure mechanisms: features such as TRIM and background garbage collection can permanently delete data, which makes it easier for criminals to remove data traces (Kumar, 2021), and there is no straightforward method to recover data traces once an SSD has been erased.

Figure 11: Classification of storage devices: persistent vs. volatile storage.

In contrast to persistent storage, volatile storage refers to storage devices that lose data after power is removed, such as common DDR4 and DDR5 memory. Since volatile memory can only retain data while power is supplied, special care must be taken when collecting and analyzing data from volatile storage to avoid interfering with the contents of the memory during its operation. Trace analysis of volatile storage typically relies on memory dump techniques, which quickly extract data from memory to recover information such as program states and user activity logs (Nyholm et al., 2022). However, the dynamic nature of memory data and the loss of data upon power down make volatile storage trace analysis more challenging compared to persistent storage.
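To illustrate what extracting traces from a memory dump can look like at its simplest, the sketch below scans a raw dump for PE executable headers, a naive form of the carving that dedicated frameworks such as Volatility perform far more rigorously. Reading the whole image into memory is an assumption that only suits small test dumps, and the file name is hypothetical.

```python
def find_pe_candidates(dump_path: str) -> list[int]:
    """Return offsets of 'MZ' signatures whose e_lfanew field points at a
    valid 'PE\\0\\0' marker -- a crude indicator of executables in memory."""
    data = open(dump_path, "rb").read()
    hits = []
    pos = data.find(b"MZ")
    while pos != -1:
        if pos + 0x40 <= len(data):
            # e_lfanew (offset 0x3C in the DOS header) locates the PE header.
            e_lfanew = int.from_bytes(data[pos + 0x3C:pos + 0x40], "little")
            pe_off = pos + e_lfanew
            if 0 < e_lfanew < 0x1000 and data[pe_off:pe_off + 4] == b"PE\x00\x00":
                hits.append(pos)
        pos = data.find(b"MZ", pos + 2)
    return hits

print(find_pe_candidates("memdump.raw"))  # hypothetical dump file
```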

Related research

In recent years, research on storage device traces has mainly focused on memory acquisition techniques, and therefore, the studies summarized in Table 7 are primarily concentrated in the field of memory forensics.

Table 7:
Trace monitoring related literature at the storage device layer.

Related research | Research content | Contribution
Case & Richard (2016) | Memory | Focused on using memory forensic techniques to detect user-space malware written in Objective-C and proposed innovative methods for analyzing user-space malware.
Pagani, Fedorov & Balzarotti (2019) | Memory | Introduced the temporal dimension in memory forensics, exploring the impact of acquisition time on analysis results, validating the negative effect of temporal inconsistencies through experiments, and modifying existing tools to explicitly account for the temporal dimension and minimize its effects.
Schneider, Wolf & Freiling (2020) | Memory | Studied the difficulty of tampering with main memory images during digital forensics and the methods for detecting such tampering, revealing that while tampering with main memory images is challenging, it carries a significant probability of misleading analysts.
Parida & Das (2021) | Memory | Proposed a mechanism called PageDumper to collect page table manipulation information at runtime, capturing attack footprints related to memory protection manipulation and supplementing memory snapshot data for more practical and in-depth memory analysis.
Seo, Kim & Lee (2021) | Memory | Proposed a method to detect headerless executable files by scanning the Section table in memory dumps, addressing the issue of malware evading memory forensics by removing executable file headers, and implemented the method as a plugin in the Volatility 3 framework.
Manna et al. (2022) | Memory | Developed a suite of memory analysis plugins targeting .NET and .NET Core applications, automatically identifying key runtime areas and suspicious malicious components, enhancing the detection of modern malware, especially memory-only attacks.
Arfeen et al. (2022) | Memory | Developed a framework for regular volatile memory acquisition to analyze individual process behavior, extracting key features and utilizing machine learning to efficiently classify malicious and benign processes, improving ransomware detection accuracy.
Bellizzi et al. (2022) | Memory | Proposed Just-in-time Memory Forensics (JIT-MF) to address the challenge of capturing short-lived evidence on Android devices and developed the MobFor tool to capture evidence of messaging hijack attacks, outperforming traditional tools in real-world attack scenarios.
Zhang et al. (2023) | Memory | Proposed a malware detection method based on convolutional neural networks and memory forensics, analyzing PE file fragments in memory and achieving a high detection accuracy of 97.48%, particularly for fileless malware.
Khalid et al. (2023) | Memory | Proposed a machine learning-based detection method for fileless malware, extracting features from memory dumps using the Volatility tool and analyzing them with machine learning algorithms. Experimental results show Random Forest achieved 93.33% accuracy, outperforming other classifiers.
Rzepka et al. (2024) | Memory | Evaluated causal inconsistencies in Windows 10 memory dumps using two methods, finding that a high level of inconsistency is common, and proposed controlling inconsistency levels by managing workload and the execution timing of memory acquisition tools.
Parida, Nath & Das (2024) | Memory | Proposed a mechanism named SAM to support wipe-aware legal analysis in non-atomically acquired volatile memory snapshots, enhancing the reliability and quality of memory analysis.
Kirmani & Banday (2024) | Memory | Conducted an in-depth analysis of the importance of reliability in physical memory forensics and proposed a comprehensive set of strategies and practical methods to enhance the reliability of memory forensics.
Penrose, Buchanan & Macfarlane (2015) | HDD | Proposed a fast contraband detection method using disk sampling and Bloom filters, achieving 99.9% accuracy in scanning large-capacity disks in minutes and addressing the challenge of forensic tools lagging behind increasing storage capacities.
Meffert, Baggili & Breitinger (2016) | HDD | Proposed an attack on digital forensic tools by modifying the firmware of a popular hardware write blocker (TD3), corrupting the integrity of the destination disk without the user's knowledge and highlighting the need for security testing of forensic tools.
Bharadwaj & Singh (2018) | HDD | Proposed a region-based random sector sampling method to efficiently search for target data traces in large storage devices, quantifying the necessary percentage of random samples for effective examination.
Shey et al. (2018) | SSD | Proposed a non-invasive side-channel method using current measurements and machine learning to infer SSD TRIM operations with over 99% accuracy, analyzed SSD current signatures, and recommended a minimum sampling frequency for detecting TRIM operations, validating the current probe technique.
Jeong & Lee (2019) | HDD, SSD, USB | Introduced a new forensic signature for tracking storage devices, addressing limitations in HDD and SSD tracking, identified artifacts in UEFI firmware images independent of the operating system, and developed a methodology for device tracking based on storage type.
Kumar (2021) | SSD | Examined challenges in SSD forensics, particularly the impact of the TRIM function and background garbage collection on deleted data recovery, discussed uncertainties in SSD data acquisition, and provided guidelines and recommendations based on experimental analysis.
Chhajed (2024) | HDD, SSD | Explored how different operating systems, file system types (e.g., ext4, NTFS, FAT), and storage devices (SSD and HDD) affect the persistence of deleted files, aiming to improve digital forensics reliability by analyzing the impact of common user activities on the overwrite speed of deleted files.

In recent years, volatile device data detection has taken a dominant position in the research on trace detection at the storage device layer. This is primarily due to the high investigative value of the data stored in the system’s random access memory (RAM), as well as the rapid development of fileless malware. Fileless malware is more likely to evade traditional detection methods on hard drives and does not leave traces on the disk (Khalid et al., 2023). However, when these malicious programs’ characteristic information is stored in RAM, their hidden data traces can be extracted from memory in an unencrypted and easily accessible format. Therefore, related research is actively exploring the optimal methods for volatile memory detection.
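A hedged sketch of the machine-learning pipeline these studies describe is given below: a Random Forest is trained to separate benign from malicious processes using features extracted from memory dumps. The synthetic features and labels here are placeholders; real work derives them with tools such as Volatility and labeled malware corpora.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Illustrative per-process features, e.g. [n_rwx_regions, n_injected_pages,
# handle_count, code_entropy]; replace with features from real memory dumps.
X = rng.random((500, 4))
y = rng.integers(0, 2, size=500)        # stand-in benign/malicious labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")  # ~0.5 on random data
```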

However, these studies also reveal a significant technological imbalance, resulting in insufficient attention to the study of data trace characteristics in persistent storage devices. A possible reason for this is that the data in persistent storage devices is typically based on the structural data of the operating system and file system, making it more appropriate to conduct data trace detection research at the file system and operating system stages rather than at the storage device stage. Additionally, the time sensitivity and complexity of memory, as well as the ongoing debate regarding the credibility of memory snapshots, have long been challenges in memory data investigation. The rapid evolution of fileless malware also makes it difficult for existing memory forensics methods to adapt to new attack techniques. Therefore, although memory forensics holds significant potential in combating modern malware, the technology’s refinement and adaptability still require continuous testing and improvement in practical applications.

Limitations and future directions

Limitations

Although the prospects for integrating data trace detection technology into a metrological data trustworthiness detection system are promising, the analysis in the previous section reveals certain limitations in existing detection technologies. Metrological trustworthiness detection requires comprehensive verification across the entire "design-production-inspection" chain, involving coordination and data transmission between metrological devices across different systems. However, current trace detection technologies often struggle to achieve cross-platform correlation analysis, which limits their application in complex systems. Furthermore, real-time trust verification of dynamic data streams faces significant bottlenecks: existing detection technologies cannot meet the real-time requirements of metrological scenarios in which millions of data points are generated per second, resulting in verification gaps for critical data segments. Finally, although this study focuses on the local storage process, many data trace techniques in fields such as network forensics (Mishra et al., 2021) and cloud forensics (Purnaye & Kulkarni, 2022) remain to be explored.

Future directions

To facilitate the exploration of data trace detection technologies that contribute to trustworthiness verification in the field of metrology, future research can be conducted in the following directions:

  • Smart algorithm-driven trusted verification of measurement data: Measurement data are voluminous and structurally complex, containing diverse data forms that combine into intricate tree-structured data networks. Traditional manual inspection is no longer sufficient for trusted verification of measurement data. Developing automated anomaly detection models that integrate deep learning and artificial intelligence will therefore be an important research direction for improving detection accuracy and efficiency (a minimal sketch follows this list).

  • Measurement device data tampering forensic technology: As malicious techniques for tampering with measurement device data become increasingly sophisticated, especially with embedded systems in measurement devices, their volatile storage devices hold significant value for data trace detection. Therefore, developing targeted data detection technologies will be a key pathway to addressing tampering issues in measurement devices related to public services, such as fuel dispensers and electric meters.

  • Multi-modal measurement data full-chain traceability framework: In trusted verification of measurement data, the traceability of data at a single stage is insufficient to establish the trustworthiness of the entire chain. Researchers need to explore collaborative analysis mechanisms for multiple types of information, integrating data content, file system metadata, system logs, and other multi-dimensional data traces, to construct a trustworthiness verification system covering the entire lifecycle of data generation, transmission, and storage.
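As a minimal sketch of the first direction above, the snippet below screens a one-dimensional measurement stream for anomalous readings with an Isolation Forest. The input file, features, and contamination rate are illustrative assumptions; flagged points are candidates for human review, not verdicts of tampering.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical stream: one reading per line in meter_readings.csv.
readings = np.loadtxt("meter_readings.csv", delimiter=",")
# Pair each reading with its first difference so abrupt jumps stand out.
X = np.column_stack([readings, np.diff(readings, prepend=readings[0])])

iso = IsolationForest(contamination=0.01, random_state=0).fit(X)
suspect = np.flatnonzero(iso.predict(X) == -1)
print("indices flagged for manual review:", suspect)
```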

Conclusion

The integration of data trace analysis into metrology represents a key advancement in addressing the evolving challenges of data trustworthiness in the era of digital transformation. This study highlights the critical role of data traces—residual artifacts from the processes of data generation, transmission, and storage—in detecting tampering and verifying authenticity. By reviewing data trace detection techniques in stages and adapting them to the context of metrology, the study provides a detailed examination of the contributions and limitations of each phase. The research demonstrates that, although techniques such as timestamp consistency checks, deep learning-based anomaly detection, and volatile memory analysis show considerable potential in the metrology field, they are still constrained by the unique complexities of this domain, including the specialized storage architectures of metrological devices and the stringent compliance requirements for data uncertainty propagation as defined by metrological standards. Future research should focus on three key areas: developing automated verification models to handle large-scale heterogeneous data, advancing real-time forensic techniques for volatile memory in embedded devices, and building multi-modal frameworks to integrate trace data across the entire data lifecycle. Ultimately, through the study of data traces, the development of trustworthy detection for metrological data will take a more solid step forward.

Supplemental Information

PRISMA checklist.

DOI: 10.7717/peerj-cs.3106/supp-1

PRISMA flow diagram.

DOI: 10.7717/peerj-cs.3106/supp-2