Computer Scientists Developing Technology To Improve Data Mining For Homeland Security

From online news articles to blogs, a massive amount of information is voluntarily being put before the public every day.

Some of this information may be valuable for protecting homeland security. However, to sift through this readily available content and summarize it for agencies like the Department of Homeland Security, analysts need to do more than sit at a computer, entering words like "al-Qaida" into Internet search engines.

That's why Kansas State University's William Hsu and other computer scientists who research data mining are part of a project to develop technology that makes automated Internet searches more useful and productive.

"We're helping to develop the next generation of Web search and crawling," Hsu said. "Our goal is to develop a research program that will help with homeland security. The Department of Homeland Security wants to pull information that's available to anyone in the public domain, like millions of articles from sources like CNN and Al-Jazeera, and monitor them for security."

Hsu is an associate professor of computer and information sciences, head of K-State's Laboratory for Knowledge Discovery in Databases, and co-principal investigator of a Department of Homeland Security-funded summer institute aimed at training future researchers in data sciences. The $2.4 million Data Sciences Summer Institute, headed by the University of Illinois along with K-State and the University of Texas at San Antonio, is titled "Multimodal Information Access and Synthesis." The Illinois-led cooperative is one of four such University Affiliate Centers nationwide.

Data mining is a way of processing vast amounts of information and putting it in multiple, useful formats. Hsu's data mining research at K-State includes applications in fields like genome analysis, nanoscale materials modeling and diagnostic medicine. The work at K-State that will benefit homeland security strives to resolve ambiguity in Internet searches. For instance, this would allow a search engine to differentiate between homeland security as a concept and Homeland Security as a government agency. Hsu said that one of the institute's projects aims to improve name recognition, a heavily studied problem in information extraction.

"The goal is to develop an automated system that can pick out al-Qaida as an organization, Kandahar as a place and Osama bin Laden as a person, based upon rules developed from previously seen documents," Hsu said. "Subcategories are a problem," he said. "'People' is a big tag. Is this a head of state? A celebrity? Someone who was interviewed?"

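The idea of tagging names with rules drawn from previously seen documents can be illustrated with a small sketch. This is not the institute's system. It assumes a hypothetical handful of hand-labeled phrases (`LABELED`) and simply looks those phrases up in new text, which is the most basic form of rule-based name recognition.

```python
import re

# Hypothetical labeled phrases standing in for "previously seen documents".
LABELED = [
    ("Osama bin Laden", "PERSON"),
    ("Kandahar", "PLACE"),
    ("Department of Homeland Security", "ORGANIZATION"),
]

def build_rules(labeled):
    """Turn labeled examples into a lowercase phrase -> tag lookup."""
    return {phrase.lower(): tag for phrase, tag in labeled}

def tag_entities(text, rules):
    """Return (phrase, tag) pairs in order of appearance, longest rules first."""
    taken = []   # character spans already claimed by a longer phrase
    found = []
    for phrase in sorted(rules, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            if any(m.start() < end and start < m.end() for start, end in taken):
                continue  # overlaps a phrase that was already tagged
            taken.append((m.start(), m.end()))
            found.append((m.start(), m.group(0), rules[phrase]))
    return [(phrase, tag) for _, phrase, tag in sorted(found)]

rules = build_rules(LABELED)
print(tag_entities("Osama bin Laden was reported near Kandahar.", rules))
```

A lookup like this cannot answer the subcategory question Hsu raises, such as whether a person is a head of state or an interviewee. That would need additional rules or context beyond the name itself.
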
Data mining research at K-State and collaborating institutions is also helping solve another problem with getting information off the Internet -- inefficient crawling. Hsu said search engines provide up-to-date results by first looking through vast numbers of Web pages and archiving them, a process called crawling. The project leader, Kevin Chang at the University of Illinois, describes the problem with this process as "crawling in the dark -- you start somewhere and grab everything," Hsu said. Research in this area will lead to better searches in which search engines can, for instance, anticipate keywords, he said. Search engines also could create virtual neighborhoods of information in which connections are made among bits of information based on the results of similar searches.
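
One common alternative to grabbing everything is a focused, or best-first, crawl, which follows the most promising links first. The sketch below is an illustration under stated assumptions, not the project's method. It uses a hypothetical in-memory set of pages (`LINKS`) instead of the real Web and scores each link by how many target keywords appear in its link text.

```python
import heapq

# Hypothetical in-memory "web": page -> list of (link text, linked page).
LINKS = {
    "home": [("football scores", "sports"), ("security threat news", "security")],
    "sports": [("league table", "table")],
    "security": [("threat alerts", "alerts"), ("weather", "weather")],
    "table": [],
    "alerts": [],
    "weather": [],
}

def score(link_text, keywords):
    """Count the words in the link text that are target keywords."""
    return sum(1 for word in link_text.lower().split() if word in keywords)

def focused_crawl(links, start, keywords, budget):
    """Visit at most `budget` pages, following the highest-scoring links first."""
    frontier = [(0, 0, start)]  # min-heap of (negated score, insertion order, page)
    seen = {start}
    order = 1
    visited = []
    while frontier and len(visited) < budget:
        _, _, page = heapq.heappop(frontier)
        visited.append(page)
        for link_text, target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                heapq.heappush(frontier, (-score(link_text, keywords), order, target))
                order += 1
    return visited

print(focused_crawl(LINKS, "home", {"security", "threat"}, budget=3))
```

With a budget of three pages, this crawl reaches the security and alerts pages, while a crawl that took links in the order found would spend one of its three visits on the sports page.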

Although text-based searches have their complications, Hsu said searching for images is even harder because searches rely on the words people use to describe the images, such as a photo caption. Data mining research at K-State and its partner institutions is leading to technology that will allow search engines to "look" through images from the Web. Hsu said search engines would sift through images that are automatically annotated, or marked up, to describe their contents. This would be done using tools that analyze the shape, border, color and orientation of objects, among many other features, to pick out, for instance, an image of George W. Bush in a press conference photo.

"Computers will figure out an image identity by 'seeing' a feature that all such images have in common," Hsu said.

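Color is one of the features mentioned above, and a rough sketch shows how a shared feature could drive automatic annotation. The code below is a simplified illustration, not the researchers' tools. It assumes images are given as non-empty lists of (r, g, b) pixel values and that a few hypothetical labeled examples are available, and it labels a new image with the tag of the example whose color distribution is most similar.

```python
from collections import Counter

def color_histogram(pixels, bins=4):
    """Normalized coarse color histogram; pixels are (r, g, b) values in 0..255."""
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)  # assumes at least one pixel
    return {bucket: count / total for bucket, count in counts.items()}

def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones."""
    return sum(min(value, h2.get(bucket, 0.0)) for bucket, value in h1.items())

def annotate(pixels, labeled_examples):
    """Return the label of the labeled example most similar in color."""
    hist = color_histogram(pixels)
    return max(
        labeled_examples,
        key=lambda label: similarity(hist, color_histogram(labeled_examples[label])),
    )

# Hypothetical labeled examples, each a flat list of pixels.
examples = {
    "sky": [(20, 40, 220)] * 10,
    "grass": [(30, 200, 40)] * 10,
}
query = [(25, 50, 210)] * 8 + [(30, 190, 50)] * 2
print(annotate(query, examples))
```

Recognizing a specific person in a press conference photo would need far richer features, such as shape, border and orientation, than this color-only comparison.
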
The next generation of data mining research, Hsu said, will involve computer scientists working with social scientists. By scouring news articles and other public data, researchers can work on something called sentiment analysis.

"Sometimes Homeland Security just needs to know, for instance, what the local reaction is to a particular event such as a bomb threat or recent explosion," Hsu said.

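A very simple form of sentiment analysis counts positive and negative words in a piece of text. The sketch below assumes two tiny, hypothetical word lists (`POSITIVE` and `NEGATIVE`) and is meant only to show the basic idea. It does not handle negation, sarcasm or context.

```python
import re

# Hypothetical, deliberately tiny word lists; real lexicons are far larger.
POSITIVE = {"calm", "safe", "relieved", "praised", "grateful"}
NEGATIVE = {"afraid", "angry", "panic", "fear", "worried"}

def sentiment_score(text):
    """Return a score from -1.0 (all negative) to 1.0 (all positive); 0.0 if no match."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    total = positive + negative
    return 0.0 if total == 0 else (positive - negative) / total

print(sentiment_score("Residents were worried and afraid after the bomb threat"))
print(sentiment_score("Neighbors felt safe and praised the police"))
```

Averaging such scores over many local news articles about one event would give a rough picture of the local reaction.
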