A study on feature selection method integrating technological network features and stability scoring for high-value patent identification


Abstract

High-value patents are critical assets for assessing technological competitiveness and steering industrial advancement, as they embody both fundamental innovations and market potential. However, existing approaches to high-value patent identification rely largely on structured or textual features, which have notable limitations in feature construction and text representation. On the one hand, feature redundancy and weak correlations undermine the interpretability of the selected features. On the other hand, most textual features remain superficial and do not explore inter-patent technological relationships and network structures in depth, which constrains identification performance. To address these issues, we propose a feature selection method that integrates technological network features with stability scoring for high-value patent identification. First, technical phrases are extracted from patent texts to construct semantic–co-occurrence networks and topic-clustered networks. Several edge-weighting strategies are designed for these networks to quantify inter-patent associations and capture domain-specific structural characteristics. Second, to tackle feature redundancy, we propose a feature selection method based on stability scoring of random forest feature importance rankings. The data are resampled repeatedly, and the fluctuations of the feature rankings across resamples are statistically analyzed and transformed into stability scores. Building on these scores, the Sequential Forward Floating Selection (SFFS) algorithm is employed to identify key features that effectively characterize high-value patents, thereby enhancing interpretability. Experiments conducted on UCI datasets demonstrate that, compared with traditional stability scoring, random forest ranking, and related methods, the proposed approach achieves superior performance in classification tasks. Finally, the proposed method is applied to an empirical study on high-value patent identification. The results show that combining network features with structured features under stability-score-based feature selection improves the performance of high-value patent identification, and the interpretability analysis of the selected features further confirms the importance of the network features. In conclusion, the proposed method improves the accuracy of high-value patent identification while providing new perspectives for understanding the technological core and innovative contributions of patents, thereby supporting technology innovation management and industrial decision-making.
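The stability-scoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration. The score `1 / (mean_rank * (1 + rank_spread))` is hypothetical, the stratified bootstrap is one possible resampling scheme, and `abs_mean_gap` is a simple stand-in importance measure, not a random forest. In practice one would likely plug in a function that fits a forest on each resample and returns its importances (for example the `feature_importances_` attribute of scikit-learn's `RandomForestClassifier`). The SFFS search that follows scoring is not shown.

```python
import random
import statistics
from typing import Callable, List

Importance = Callable[[List[List[float]], List[int]], List[float]]


def abs_mean_gap(X: List[List[float]], y: List[int]) -> List[float]:
    """Stand-in importance: |mean(feature | y=1) - mean(feature | y=0)|.

    NOT random forest importance; it only keeps this sketch dependency-free.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    return [
        abs(statistics.fmean(r[j] for r in pos) - statistics.fmean(r[j] for r in neg))
        for j in range(len(X[0]))
    ]


def ranks_from_importance(importance: List[float]) -> List[int]:
    """Rank 1 goes to the most important feature."""
    order = sorted(range(len(importance)), key=lambda j: -importance[j])
    ranks = [0] * len(importance)
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    return ranks


def stability_scores(X, y, importance: Importance, n_rounds: int = 30, seed: int = 0):
    """Resample the data, rank features each round, and score rank stability.

    Hypothetical score: high when a feature ranks near the top on average
    AND its rank fluctuates little across resamples.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)

    rank_history = [[] for _ in range(len(X[0]))]
    for _ in range(n_rounds):
        # Stratified bootstrap: sample with replacement inside each class.
        idx = [rng.choice(members) for members in by_class.values() for _ in members]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        for j, rank in enumerate(ranks_from_importance(importance(Xb, yb))):
            rank_history[j].append(rank)

    return [
        1.0 / (statistics.fmean(hist) * (1.0 + statistics.pstdev(hist)))
        for hist in rank_history
    ]


def make_toy_data(n: int = 200, seed: int = 1):
    """Feature 0 carries the label signal; features 1 and 2 are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label * 3.0 + rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)
    return X, y


X, y = make_toy_data()
scores = stability_scores(X, y, abs_mean_gap)
best = max(range(len(scores)), key=scores.__getitem__)
print("stability scores:", [round(s, 3) for s in scores], "best feature:", best)
```

On this toy data the informative feature should keep the top rank in every resample, while the two noise features trade places and are penalized both for their lower average rank and for their rank fluctuation. A search procedure such as SFFS could then consider candidate features in descending order of score.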

Notes for potential reviewers

  • Volunteering is not a guarantee that you will be asked to review. There are many reasons: reviewers must be qualified, there should be no conflicts of interest, a minimum of two reviewers have already accepted an invitation, etc.
  • This is NOT OPEN peer review. The review is single-blind, and all recommendations are sent privately to the Academic Editor handling the manuscript. All reviews are published and reviewers can choose to sign their reviews.
  • What happens after volunteering? It may be a few days before you receive an invitation to review with further instructions. You will need to accept the invitation to then become an official referee for the manuscript. If you do not receive an invitation it is for one of many possible reasons as noted above.

  • PeerJ Computer Science does not judge submissions based on subjective measures such as novelty, impact or degree of advance. Effectively, reviewers are asked to comment on whether or not the submission is scientifically and technically sound and therefore deserves to join the scientific literature. Our Peer Review criteria can be found on the "Editorial Criteria" page - reviewers are specifically asked to comment on 3 broad areas: "Basic Reporting", "Experimental Design" and "Validity of the Findings".
  • Reviewers are expected to comment in a timely, professional, and constructive manner.
  • Until the article is published, reviewers must regard all information relating to the submission as strictly confidential.
  • When submitting a review, reviewers are given the option to "sign" their review (i.e. to associate their name with their comments). Otherwise, all review comments remain anonymous.
  • All reviews of published articles are published. This includes manuscript files, peer review comments, author rebuttals and revised materials.
  • Each time a decision is made by the Academic Editor, each reviewer will receive a copy of the Decision Letter (which will include the comments of all reviewers).
