The content and services available on the web continue to
be accessed mostly through direct human control. But this
is changing. Increasingly, users rely on automated agents
that save them time and effort by programmatically
retrieving content, performing complex interactions, and
aggregating data from diverse sources.
Programming Spiders, Bots, and Aggregators in Java teaches
you how to build and deploy a wide variety of these agents,
from single-purpose bots to exploratory spiders to
aggregators that present a unified view of information from
multiple user accounts. You will build on your basic
knowledge of Java to quickly master the techniques that are
essential to this specialized world of programming,
including parsing HTML, interpreting data, working with
cookies, reading and writing XML, and managing high-volume
workloads. You'll also learn about the ethical issues
associated with bot use and the limitations imposed by
some websites. This book offers two levels of instruction,
both of which are focused on the library of routines
provided on the companion CD. If your main concern is
adding ready-made functionality to an application, you'll
achieve your goals quickly thanks to step-by-step
instructions and sample programs that illustrate effective
implementations. If you're interested in the technologies
underlying these routines, you'll find in-depth
explanations of how they work and the techniques required
for customization.
Contents
- Introduction
- Chapter 1 Java Socket Programming
- Chapter 2 Examining the Hypertext Transfer Protocol
- Chapter 3 Accessing Secure Sites with HTTPS
- Chapter 4 HTML Parsing
- Chapter 5 Posting Forms
- Chapter 6 Interpreting Data
- Chapter 7 Exploring Cookies
- Chapter 8 Building a Spider
- Chapter 9 Building a High-Volume Spider
- Chapter 10 Building a Bot
- Chapter 11 Building an Aggregator
- Chapter 12 Using Bots Conscientiously
- Chapter 13 The Future of Bots
- Appendix A The Bot Package
- Appendix B Various HTTP Related Charts
- Appendix C Troubleshooting
- Appendix D Installing Tomcat
- Appendix E How to Compile Examples Under Windows
- Appendix F How to Compile Examples Under UNIX
- Appendix G Recompiling the Bot Package
- Glossary
- Index