Time Keynote Agenda - 4F International Conference Hall
08:30~09:00 Registration
09:00~09:10 Opening
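09:10~09:20 Special Guest │ 張明正 (Chairman, Trend Micro)
Cutting Edge Hadoop Technology and the Trend │ Andrew Purtell (Senior Architect, Trend Micro; HBase PMC member and committer)
10:10~10:30 Break / Booth Visit
Scalable Machine Learning with Hadoop │ Grant Ingersoll (Chief Scientist, LucidWorks)
11:20~12:10 Trend Micro's Cloud Discovery Journey - Sharing the Experience of Building Core Enterprise Competitiveness with Hadoop │ 陳永強 (Head of Cloud Solutions, Trend Micro)
12:10~13:30 Lunch / Booth Visit
Time A. "Developers" B. "Operators" C. "Use Cases"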
Oozie Introduction
楊詠成 (Gibson Yang) / Yahoo! Taiwan
Oozie is an open-source workflow / coordination service to manage data processing jobs for Apache Hadoop™. It is an extensible, scalable and data-aware service to orchestrate dependencies between jobs running on Hadoop (including HDFS, Pig and MapReduce).
In this talk, we will introduce Oozie and share our experience with it at Yahoo!

王耀聰 (Jazz Wang) /
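After seven years of development, Hadoop finally released version 1.0 in December 2011, a sign that Hadoop has matured enough to support enterprise operations. Even so, the main thing that still puts people off Hadoop is that it is not friendly enough. The first problem beginners usually face is a lack of the background knowledge needed to deploy a cluster.
In Taiwan, most IT professionals still use Windows as their primary operating system. This talk will share an all-in-one installer called Hadoop4Win which, besides serving as a first step toward learning the Hadoop ecosystem, can also serve as an experimental environment for developing Hadoop programs. It will then introduce how to build a Hadoop cluster using hiCloud.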

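辜文元 / GIS Center, Feng Chia University

Using as an example the research results of a collaboration between the Feng Chia University GIS Center and the National Space Organization that combines the FORMOSAT-2 satellite with Hadoop cloud technology, this talk explains how to apply Hadoop to geographic information systems. With Hadoop as the foundation for image management and value-added applications, a satellite image management platform architecture is developed. Topics include the management of massive volumes of FORMOSAT-2 satellite imagery, and how to publish map services on top of Hadoop HDFS so that a broad base of GIS clients can use FORMOSAT imagery.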


Big Data, Hadoop and R
Laurence Liew /Revolution Analytics
This session will discuss the use of R within a Hadoop environment.
It covers the motivation for using R and how R is used today inside and alongside a Hadoop cluster. A short video of RHadoop will be shown.

[1] An introduction to the R project

[2] An introduction to the R-Hadoop project
Hadoop Security Overview - From Security Infrastructure Deployment to High-Level Services
施宏良 (Jason Shih) / Etu, SYSTEX Corp.
The growing adoption of the open-source Hadoop framework for fast data processing and analytics on very large data volumes has drawn attention to enterprise-wide security concerns: fine-grained control over sensitive information, and isolation between different levels or groups of access on shared storage and computing facilities. Prior to Hadoop 0.20, Unix-like file permissions were introduced, together with a simple cluster-wide authentication mechanism, but access control per job queue, job submission, and other operations was lacking. With Hadoop's new security features and their integration with Kerberos, it is now possible to apply strong authentication and authorization to enforce rigorous access control over data and resources, as well as isolation between running tasks. In this presentation, we will cover the deployment details of Hadoop security in a cluster environment and its implementation for high-level services based on a Kerberized security infrastructure. We also introduce the Etu Appliance, which provides fast deployment, system automation, and a built-in cross-realm trust mechanism that enables interoperation with an existing Active Directory domain or external LDAP realm and helps reduce integration and operational overhead for administrators.
Ad hoc Query - Querying Massive Data with Ease
蘇柏綸 (Alex Su) / Trend Micro
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs. We’ll present a case study of Trend Micro's integration of data analytics tools into its existing Hadoop-based, Pig-centric analytics platform. In our deployed solution, common data analytics tasks such as data sampling, feature generation, training, and testing can be accomplished quickly and directly in Pig, via carefully crafted loaders, storage functions, and user-defined functions.
14:50~15:20 Break / Booth Visit
Scalable Data Processing: Bulk Synchronous Parallel
林家弘 (Chia-Hung Lin) / Fliptop Inc.
Hadoop MapReduce[1] is a popular open-source framework inspired by the map and reduce functions of functional programming; it saves developers a lot of work by handling many complicated underlying tasks. However, not every task fits the MapReduce model; graph-related computation (e.g. social network analysis) is one such example. Google therefore developed an in-house product, Pregel[2], based on Bulk Synchronous Parallel[3], a bridging model well suited to iterative algorithms and large-scale graph processing.

1. What is Bulk Synchronous Parallel?
2. Apache Hama
3. Comparison between Hadoop MapReduce and Apache Hama
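
To make the Bulk Synchronous Parallel idea concrete, here is a minimal single-process sketch in Python. It is not Apache Hama or Pregel code; the function name bsp_max and the example graph are hypothetical, chosen only for illustration. Each loop iteration plays the role of one superstep: vertices compute on the messages received in the previous superstep, send new messages, and then all wait at a barrier before the next superstep begins.

```python
from collections import defaultdict


def bsp_max(graph, values):
    """Propagate the maximum value through a graph in BSP-style supersteps.

    graph:  dict mapping vertex -> list of neighbour vertices
    values: dict mapping vertex -> initial numeric value
    Returns (final values, number of supersteps executed).
    Hypothetical helper for illustration; not part of any real framework.
    """
    values = dict(values)

    # Superstep 0: every vertex sends its own value to its neighbours.
    inbox = defaultdict(list)
    for vertex, neighbours in graph.items():
        for n in neighbours:
            inbox[n].append(values[vertex])
    supersteps = 1

    # Each iteration is one superstep; stop when no messages are in flight.
    while inbox:
        outbox = defaultdict(list)
        # Local computation: a vertex only sees messages from the last superstep.
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                values[vertex] = best
                # Communication: notify neighbours of the improved value.
                for n in graph.get(vertex, []):
                    outbox[n].append(best)
        # Barrier: messages sent now become visible only in the next superstep.
        inbox = outbox
        supersteps += 1

    return values, supersteps


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(bsp_max(graph, {"a": 3, "b": 6, "c": 2}))
```

The computation halts on its own once a superstep produces no messages, which is the same termination idea used by vertex-centric graph systems; a real BSP framework would distribute the vertices across machines and implement the barrier over the network.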
Hadoop Operations Experience Sharing
- Things to Watch for When Planning Hadoop Operations
張家豪 (James Chang) / Trend Micro
Over the last few years, there has been a fundamental shift in data storage, management, and processing. Companies are storing more data from more sources in more formats than ever before.
This isn't just about being a "data pack-rat," but about building products, features, and intelligence predicated on knowing more about their world (where their world can be users, searches, machine logs and so forth).
In this session, we’ll present a case study of Trend Micro's Hadoop cluster operations, covering NameNode and JobTracker HA, network topology, and metrics monitoring tools.

Mohohan: An on-line video transcoding service via Hadoop
陳俊翰 (Chun-Han Chen) / OgilvyOne
Hadoop, a well-known cloud computing file system and development framework, is mainly designed for managing massive textual data: counting, sorting, indexing, pattern finding, and so on. However, multimedia-oriented services built on Hadoop are rarely seen. Mohohan is an on-line multimedia transcoding system for video resources, implemented with Amazon Web Services (AWS) EC2, AWS S3, AWS EMR, Hadoop, and ffmpeg. Its goal is to reduce overall execution time by transcoding in parallel on a Hadoop cluster. The concept of Mohohan is simple: 1) divide the video into several chunks of frames, 2) transcode the chunks in parallel on multiple nodes (i.e., task trackers) of the Hadoop cluster, and 3) merge the transcoded results into the output. For comparison with similar SaaS offerings, a test report from an impartial third-party organization, CloudHarmony, was chosen. The experimental results show that Mohohan performs better than the other on-line video transcoding services mentioned in the test report, such as Encoding, Zencoder, Sorenson, and Panda.
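
The three-step concept in this abstract (divide, transcode in parallel, merge) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Mohohan's implementation: a local thread pool stands in for the Hadoop task trackers, the placeholder transcode_chunk stands in for an ffmpeg invocation, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(frames, chunk_size):
    """Step 1: divide the frame sequence into consecutive chunks."""
    return [frames[i:i + chunk_size] for i in range(0, len(frames), chunk_size)]


def transcode_chunk(chunk):
    """Step 2 (placeholder): 'transcode' one chunk.

    A real worker would run an encoder here; this stand-in only tags
    each frame so the data flow can be followed.
    """
    return [f"{frame}:transcoded" for frame in chunk]


def parallel_transcode(frames, chunk_size=4, workers=3):
    """Run the split / parallel transcode / merge pipeline on a frame list."""
    chunks = split_into_chunks(frames, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so chunks stay in sequence.
        results = list(pool.map(transcode_chunk, chunks))
    # Step 3: merge the transcoded chunks back into a single output.
    return [frame for chunk in results for frame in chunk]


if __name__ == "__main__":
    frames = [f"f{i}" for i in range(10)]
    print(parallel_transcode(frames))
```

Keeping the chunks in their original order during the merge is the important design point: the parallel workers may finish in any order, but the output must be reassembled in sequence.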
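Designing a High-Performance HBase Schema - Understanding How HBase Works and the Characteristics of Your Data
繆維武 (Scott Miao) / Trend Micro
HBase is a database built on a distributed file system and derived from Google's Bigtable. When tables are mentioned, most people associate them with traditional relational databases; in practice, however, designing an HBase schema with relational database design methods will fail to deliver the benefits of HBase and, worse still, can lead to poor HBase performance!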
Hadoop hardware and network best practices
何長興 (Kenneth Ho)
Hadoop is taking over a big chunk of the IT world. Many are already onboard, from Internet giants to cutting-edge startups, from established multinational enterprises to SMBs serving local niche markets. Yet many more plan to hop on the Hadoop bandwagon.
One of the most important questions to answer, yet one that in my observation is seldom discussed in local communities, is hardware selection and network design. This talk attempts to shed some light on best practices for selecting hardware and designing the network for your new (or next) Hadoop cluster.

陳志昇 (Vincent Chen) / TCloud Computing
Application to Precision Marketing - Hadoop in Mobile Device Web Browsing Behavior Analysis:

Notes /