Instant Jsoup How-to
Ebook, 80 pages, 21 minutes


About this ebook

In Detail

As you might know, there are many Java libraries for parsing HTML content. Jsoup is one more, but it stands out with features such as DOM navigation, CSS selectors, and HTML cleaning. Give it a try, and you will see the difference!

Instant Jsoup How-to provides simple, detailed instructions on how to use the Jsoup library to manipulate HTML content to suit your needs. You will learn the basics of data crawling, as well as the core concepts of Jsoup, so that you can make the best use of the library.

Instant Jsoup How-to will help you learn step by step using real-world, practical problems. You will begin with several basic topics, such as getting input from a URL, a file, or a string, and using DOM navigation to search for data. You will then move on to more advanced topics, such as how to use CSS selectors and how to clean dirty HTML data. HTML data is not always safe, so you will also learn how to sanitize untrusted documents to guard against XSS attacks.
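To give a flavor of these topics, here is a minimal sketch that is not taken from the book. It assumes a recent jsoup release on the classpath, where the sanitizing allow-list class is named `Safelist` (older releases, including those available when the book was published, called it `Whitelist`). The class name `JsoupSketch`, its method names, and the sample HTML are made up for illustration. Input from a URL or a file would use `Jsoup.connect(url).get()` or `Jsoup.parse(file, "UTF-8")` instead of the string overload shown here.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.safety.Safelist;

public class JsoupSketch {

    // Parse an HTML string and collect the href of every link
    // that matches the CSS selector "a[href]".
    static List<String> extractLinks(String html) {
        Document doc = Jsoup.parse(html);
        List<String> links = new ArrayList<>();
        for (Element a : doc.select("a[href]")) {
            links.add(a.attr("href"));
        }
        return links;
    }

    // Keep only the tags and attributes allowed by jsoup's "basic" safelist;
    // script elements and other unsafe markup are dropped.
    static String sanitize(String untrustedHtml) {
        return Jsoup.clean(untrustedHtml, Safelist.basic());
    }

    public static void main(String[] args) {
        // Hypothetical sample input.
        String html = "<p>See <a href=\"https://example.com/docs\">the docs</a>"
                + "<script>alert('xss')</script></p>";
        System.out.println(extractLinks(html));
        System.out.println(sanitize(html));
    }
}
```

The same `select` call accepts richer selectors (for example `img[src]` for images), which is the pattern the later recipes on listing URLs and images build on.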

Instant jsoup How-to is a book for every Java developer who wants to learn HTML manipulation quickly and effectively. This book includes the sample source code for you to refer to with a detailed explanation of every feature of the library.

Approach

Filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. This book will take a how-to approach, focusing on recipes that demonstrate Jsoup.

Who this book is for

If you work in data scraping, data crawling, or a similar area using Java, then this book is the one for you. It acts as a fast-paced, simple guide to improving your HTML data manipulation skills using one of the most well-known libraries, Jsoup.

Language: English
Release date: Jun 7, 2013
ISBN: 9781782168003
Author

Pete Houston

Pete Houston holds a B.S. in Computer Science from a university in South Korea. He has worked in the IT industry for 10 years, and his experience includes medical imaging research for diagnosing cancer symptoms, using technologies such as C, C++, COM/DLL, ActiveX Control, and C#.NET 3.0. Pete has also designed and created an Android mobile platform. Currently, he researches and implements search algorithms for data mining, working with C, Apache, Python, and Hadoop. Pete has also worked as a Technical Leader for backend systems that provide information services. He has worked with Java, Jsoup, PHP, SimpleXML, and the Yii and Slim frameworks.

    Book preview

    Instant Jsoup How-to - Pete Houston

    Table of Contents

    Instant Jsoup How-to

    Credits

    About the Author

    About the Reviewers

    www.PacktPub.com

    Support files, eBooks, discount offers and more

    Why Subscribe?

    Free Access for Packt account holders

    Preface

    What this book covers

    What you need for this book

    Who this book is for

    Conventions

    Reader feedback

    Customer support

    Downloading the example code

    Errata

    Piracy

    Questions

    1. Instant Jsoup How-to

    Giving input for parser (Must know)

    How to do it...

    How it works...

    There's more...

    Extracting data using DOM (Must know)

    Getting ready

    How to do it...

    How it works...

    There's more...

    Extracting data using CSS selector (Must know)

    Getting ready

    How to do it...

    How it works...

    There's more...

    Transforming HTML elements (Must know)

    How to do it...

    How it works...

    There's more...

    Miscellaneous Jsoup options (Should know)

    How to do it...

    There's more...

    Cleaning dirty HTML documents (Become an expert)

    Getting ready

    How to do it...

    How it works...

    There's more...

    Tags removed in

    Listing all URLs within an HTML page (Should know)

    How to do it...

    How it works...

    There's more...

    Listing all images within an HTML page (Should know)

    How to do it...

    How it works...

    Instant Jsoup How-to

    Copyright © 2013 Packt Publishing

    All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

    Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held
