--  Author: admin
--  Posted: 2008/3/25 22:42:21
--  [Recommended] Notes on the second lab assignment: word segmentation
import java.util.Arrays;

public class WordSegmentation
{
        public static void main(String[] args)
        {
                // Stop words to leave out of the output.
                String[] stopList = { "an", "and", "are", "as", "at", "be", "by",
                                "for", "from", "has", "he", "in", "is", "it", "its", "of",
                                "on", "that", "the", "to", "was", "were", "will", "with" };
                String doc = "The search trees overcome many issues of hash dictionary";
                // binarySearch only works on a sorted array.
                Arrays.sort(stopList);
                // Lowercase the text, then split on every non-word character.
                String[] result = doc.toLowerCase().split("\\W");

                for(int i=0;i<result.length;i++)
                {
                        // Adjacent delimiters produce empty tokens; skip them.
                        if(result[i].equals(""))
                                continue;
                        // A negative return value means the token is not a stop word.
                        if(Arrays.binarySearch(stopList,result[i])<0)
                                System.out.println(result[i]);
                }
        }
}
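
The same filtering step can also be wrapped in a reusable method. The version below is only a sketch and is not part of the assignment itself: the class name StopWordFilter and the method filter are made up for illustration, and it swaps the sorted array plus binarySearch for a HashSet lookup, so the stop list no longer has to be sorted first. It also splits on "\\W+" so that runs of delimiters collapse into one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopWordFilter
{
        // Hypothetical helper (not from the original post): returns the
        // lowercased tokens of doc that do not appear in stopList.
        public static List<String> filter(String doc, String[] stopList)
        {
                // A HashSet gives constant-time lookups and needs no sorting.
                Set<String> stopWords = new HashSet<>(Arrays.asList(stopList));
                List<String> kept = new ArrayList<>();
                for (String token : doc.toLowerCase().split("\\W+"))
                {
                        // A leading delimiter still yields one empty token.
                        if (token.isEmpty())
                                continue;
                        if (!stopWords.contains(token))
                                kept.add(token);
                }
                return kept;
        }

        public static void main(String[] args)
        {
                String[] stopList = { "of", "the" };
                String doc = "The search trees overcome many issues of hash dictionary";
                for (String word : filter(doc, stopList))
                        System.out.println(word);
        }
}
```

Returning a List instead of printing inside the loop makes the result easy to check or pass on to a later step, such as counting word frequencies.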
[This post was edited by the author on 2010-12-12 08:24:12]