
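How to play

Until both "Horn of the Unicorn" (一角獣のホーン) and a recovery card have arrived, you need to play with some care.

Pattern 1: neither has arrived

Special Summon "Gate Guardian" (ゲート・ガーディアン) in Defense Position. At 500 LP, leaving it in Attack Position invites a suicide attack from an opposing monster carrying the equip card that inflicts 600 points of damage when destroyed. Pray on every draw.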

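Pattern 2: only "Horn of the Unicorn" has arrived

Special Summon "Gate Guardian" in Defense Position and equip it with "Horn of the Unicorn". It can then no longer be beaten even by an "Ancient Gear Golem" (古代の機械巨人) boosted with equip cards and the like. That is one worry gone.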

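Pattern 3: only a recovery card has arrived

Use the recovery card and Special Summon "Gate Guardian" in Attack Position. As long as your LP is above 600, the opponent will not suicide-attack. Attack Position also lowers the risk of losing to an "Ancient Gear Golem" boosted with equip cards and the like, because "Gate Guardian" has ATK 3750 / DEF 3400.

As a rule, do not attack the opponent's Attack Position monsters, because doing so forfeits the "win by effect damage only" bonus. Defense Position monsters are worth destroying for the "destroy N monsters" bonus, but if the opponent has a Set card it is better not to attack, since "Zero Gravity" (重力解除) could switch a monster to Defense Position.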

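Pattern 4: both have arrived

Use the recovery card, then put "Gate Guardian" equipped with "Horn of the Unicorn" in Attack Position.

From pattern 4 onward

Keep destroying the opponent's Defense Position monsters, and go for the win at the moment your own deck reaches 0 cards. Use up as many Shine cards (シャインカード) as you can.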

2017-10-27

[Paper notes] Block segmentation and text area extraction of vertically/horizontally written document

Paper notes, Natural language processing

Paper title

N. Amamoto, S. Torigoe, Y. Hirogaki: Block segmentation and text area extraction of vertically/horizontally written document, Proceedings of the Second International Conference on Document Analysis and Recognition, 1993.
Block segmentation and text area extraction of vertically/horizontally written document - IEEE Conference Publication

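Summary

Proposes a method that automatically extracts the text areas from documents mixing figures, tables, and text, regardless of vertical or horizontal writing, as a preprocessing step for OCR. In the evaluation it extracted 255 of the 309 text areas, an accuracy of about 83%.

Differences from prior work

Prior work
- Wahl et al.[1] proposed the method to extract blocks by smearing process for a document image.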
- Tsujimoto et al.[2] presented a method to extract adjacent connected components as a segment and to integrate them for the extraction of text area, figure etc.
- The approach to use white spaces as the basis for segmentation was proposed by Baird et al. [3].
In this paper, we present the block segmentation and text area extraction method using the white spaces of the document image, without qualifying the form of document such as vertical/horizontal writing.

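Key techniques and methods

Described in "2.3 Judgment of Vertical/Horizontal Writing"
Adapts the techniques of prior work so that they can handle Japanese

Block extraction
→ Decide whether each block is text
→ Judge vertical or horizontal writing from each block's aspect ratio
→ Total the vertical and horizontal blocks over all blocks, and take whichever is more frequent as the writing direction of the whole document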
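
The voting step above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `Block` type, the rule "taller than wide means vertical", and the tie-break toward horizontal are all assumptions made for the example, since these notes do not record the exact thresholds.

```python
from collections import Counter
from typing import List, NamedTuple


class Block(NamedTuple):
    """Hypothetical block record: bounding-box size plus a text/non-text flag."""
    width: int
    height: int
    is_text: bool


def block_direction(block: Block) -> str:
    # Assumption: a text block that is taller than it is wide is vertical writing
    return "vertical" if block.height > block.width else "horizontal"


def document_direction(blocks: List[Block]) -> str:
    # Only blocks judged to be text take part in the vote
    votes = Counter(block_direction(b) for b in blocks if b.is_text)
    # Assumption: ties fall back to horizontal
    return "vertical" if votes["vertical"] > votes["horizontal"] else "horizontal"


blocks = [
    Block(40, 300, True),    # tall text block
    Block(35, 280, True),    # tall text block
    Block(400, 60, True),    # wide text block
    Block(200, 200, False),  # figure, ignored by the vote
]
print(document_direction(blocks))
```

With the sample blocks, two vertical votes beat one horizontal vote, and the figure block is ignored.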

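Validation

Validated on 45 documents. Of the 309 text areas, 255 were extracted
The average processing time was 5 seconds per document (so the method can be considered practical)

Discussion

Cases still to be tackled:

Documents that mix vertical and horizontal writing on a single page
Distorted input images
Partial images rather than whole documents

Read next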

[1] F.M.Wahl, K.Y.Wong, R.G.Casey, "Block segmentation in mixed text/image documents", Computer Graphics and Image Processing, vol.20, pp.375-390, 1982.
[2] S.Tsujimoto and H.Asada, "Major components of a complete text reading system", in Proc. of the IEEE, vol.80, No.7, pp.1133-1149, July 1992.
[3] H.S.Baird, "Anatomy of a versatile page reader", in Proc. of the IEEE, vol.80, No.7, pp.1059-1065, July 1992.

2017-10-25

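Forecasting Nikkei 225 closing prices with "Prophet", Facebook's time series analysis library

python Kaggle

The data was obtained in the same way as in the post below (2007 through September 8, 2017).
upura.hatenablog.com

This time I followed the article below and ran everything with the default settings.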
d.hatena.ne.jp

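#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import pandas as pd
from matplotlib import pyplot as plt
from fbprophet import Prophet  # newer releases are published as `prophet`

# Prophet expects a `ds` (date) column and a `y` (value) column
df = pd.read_csv('nikkei_index.csv')

# Plot the actual closing prices
plt.plot(df['y'])

# Fit with the default settings and extend the dates 365 days into the future
model = Prophet()
model.fit(df)
future_df = model.make_future_dataframe(periods=365)

# Predict over the history plus the future dates, then plot the forecast
forecast_df = model.predict(future_df)
model.plot(forecast_df)
plt.show()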

Actual data

[Figure: plot of the actual data]

With the 365-day forecast appended

[Figure: plot with the 365-day forecast appended]

2017-10-24

Tokenizing text with python-mecab + sklearn and vectorizing it with TF-IDF

python, natural language processing

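#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import MeCab
from sklearn.feature_extraction.text import TfidfVectorizer

# Initial settings
# This token pattern also keeps single-character tokens,
# which the default pattern (two or more characters) would drop
vectorizer = TfidfVectorizer(use_idf=True, token_pattern=r'(?u)\b\w+\b')
tagger = MeCab.Tagger("-Owakati")  # create the tagger once and reuse it

# Global variables
docs = []  # Tokenized documents seen so far (the corpus)

def wakati(target_text):
    """Split Japanese text into space-separated tokens."""
    return tagger.parse(target_text)

def calcTfidfVec(new_text):
    """Add new_text to the corpus, refit, and return the TF-IDF matrix of all documents."""
    docs.append(new_text)
    vecs = vectorizer.fit_transform(docs)
    return vecs.toarray()

# Test: each call adds one row, and new words add columns
print(calcTfidfVec(wakati("押すなよ")))
print(calcTfidfVec(wakati("押すなよ")))
print(calcTfidfVec(wakati("絶対に押すなよ")))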
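
To see what the numbers in those matrices mean, here is a pure-Python sketch of the weighting that `TfidfVectorizer` applies with its default settings as I understand them: raw term counts, the smoothed IDF `ln((1 + n) / (1 + df)) + 1`, then L2 normalization per row. The token lists are written out by hand as an assumption about how MeCab would split the three test sentences, so this runs without MeCab or sklearn.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized_docs):
    """Return (vocabulary, vectors) for a list of token lists."""
    n_docs = len(tokenized_docs)
    vocab = sorted({tok for doc in tokenized_docs for tok in doc})
    # Document frequency: in how many documents each term appears
    df = Counter(tok for doc in tokenized_docs for tok in set(doc))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in tokenized_docs:
        tf = Counter(doc)
        raw = [tf[tok] * idf[tok] for tok in vocab]
        # L2-normalize each row; guard against an all-zero row
        norm = math.sqrt(sum(v * v for v in raw)) or 1.0
        vectors.append([v / norm for v in raw])
    return vocab, vectors


# Hand-written tokenization (assumed MeCab output for the test sentences)
docs = [
    "押す な よ".split(),
    "押す な よ".split(),
    "絶対 に 押す な よ".split(),
]
vocab, vectors = tfidf_vectors(docs)
for row in vectors:
    print([round(v, 3) for v in row])
```

Words that occur in every document get the minimum IDF of 1, so the two identical sentences end up with equal weights on their three words, while the words that appear only in the third sentence are weighted more heavily there.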

2017-10-24

Installing python-mecab on Mac (October 2017)

python, natural language processing

Notes to self

The Windows version is here:
Installing python-mecab on Windows (November 2017) - u++の備忘録

Machine

macOS Sierra, version 10.12.6

Python environment

Python 3.6.2 | Anaconda
Note: set up by following the site below
qiita.com

Installing python-mecab

Install MeCab itself

brew install mecab

Install the IPA dictionary

brew install mecab-ipadic

Install the Python client for MeCab

Download the file

curl -O https://pypi.python.org/packages/86/e7/bfeba61fb1c5d1ddcd92bc9b9502f99f80bf71a03429a2b31218fc2d4da2/mecab-python-0.996.tar.gz

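Extract the archive and move into the directory

tar zxfv mecab-python-0.996.tar.gz
cd mecab-python-0.996

Edit setup.py (replace the Python 2 call string.split with the str method split)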

vim setup.py

def cmd2(str):
    #return string.split (cmd1(str))
    return cmd1(str).split()
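
The edit is needed because the module-level function `string.split` exists only in Python 2; Python 3 offers the same operation as the `str.split` method. A small check of that claim follows. `sample_output` is a made-up stand-in for whatever string `cmd1` returns, not real output from the build.

```python
import string

# string.split() was a module-level function in Python 2; Python 3 no longer has it
has_module_split = hasattr(string, "split")

# Hypothetical stand-in for the text returned by cmd1(); not real build output
sample_output = "-I/usr/local/include  -L/usr/local/lib -lmecab"

# str.split() with no arguments splits on any run of whitespace
flags = sample_output.split()
print(has_module_split, flags)
```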

Build and install

sudo python3 setup.py install

Check that it works

python

import MeCab
text = "すもももももももものうち"
t = MeCab.Tagger("-Owakati")
result = t.parse(text)
print(result)
# Output
# すもも も もも も もも の うち

Reference

https://chat-rate.com/it/2280/

2017-10-23

[Paper notes] Swish: a Self-Gated Activation Function

Paper notes

Paper title

Prajit Ramachandran, Barret Zoph, Quoc V. Le: Swish: a Self-Gated Activation Function, arXiv:1710.05941 [cs.NE], 2017-10-16.
[1710.05941] Searching for Activation Functions

What is it?

Proposes a simple activation function, $f(x)=x\sigma(x)$, as a replacement for ReLU, the current mainstream choice
Outperformed ReLU on several challenging datasets, particularly with deep neural networks

upura.hatenablog.com
upura.hatenablog.com

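What is impressive compared with prior work?

ReLU is currently the standard
Many activation functions have been proposed to replace ReLU, but none has consistently outperformed it across multiple datasets
Swish outperformed ReLU with deep neural network models on several tasks, including image classification and machine translation

What is the key to the technique?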

unbounded above
- Unboundedness is desirable because it avoids saturation
bounded below
- Functions that approach zero in the limit induce even larger regularization effects because large negative inputs are “forgotten”
non-monotonic
- it produces negative outputs for small negative inputs
- The non-monotonicity of Swish increases expressivity and improves gradient flow
- This property may also provide some robustness to different initializations and learning rates
smooth
- smoothness plays a beneficial role in optimization and generalization
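
The first three properties can be checked numerically with a few lines of standard-library Python. This is only a sanity check on a finite grid, and the grid range and step are arbitrary choices for the example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def swish(x):
    return x * sigmoid(x)


# Evaluate on a grid over [-10, 10] with step 0.001
xs = [i / 1000.0 for i in range(-10000, 10001)]
ys = [swish(x) for x in xs]

# Bounded below: the grid minimum is a small negative value
min_y = min(ys)
min_x = xs[ys.index(min_y)]

# Non-monotonic: the function both decreases and increases along the grid
decreases = any(b < a for a, b in zip(ys, ys[1:]))
increases = any(b > a for a, b in zip(ys, ys[1:]))

print("minimum near x =", min_x, "value =", round(min_y, 4))
print("decreases somewhere:", decreases, "increases somewhere:", increases)
# Unbounded above: for large x, swish(x) is almost x itself
print("swish(50) =", swish(50.0))
# Large negative inputs are squashed toward zero
print("swish(-50) =", swish(-50.0))
```

On this grid the minimum sits at roughly x = -1.28 with a value of about -0.28, which is the small negative dip that makes the function non-monotonic.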

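How was it validated?

Compared with ReLU and other activation functions under a variety of conditions and settings (models and hyperparameters were left as they were set for ReLU)

Comparison with ReLU
- Training efficiency of DNNs
- Robustness across batch sizes
Comparison with other activation functions (including ReLU)
- CIFAR
- ImageNet
- Machine translation

Any discussion?

Would like to search for models and hyperparameters suited to Swish

Which paper to read next?

The ReLU paper
Richard HR Hahnloser, Rahul Sarpeshkar, Misha A Mahowald, Rodney J Douglas, and H Sebastian Seung. Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature, 405(6789):947, 2000.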

2017-10-23

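Writing the Swish activation function in Python

python, paper notes

Note: Version 2 came out on October 27, so I wrote a follow-up.
upura.hatenablog.com

End of addendum
---

Plotted in Python the activation function "Swish", which Google Brain published on October 16 and which is reported to outperform ReLU on image classification and machine translation tasks with deep neural networks. The plot corresponds to Figure 1 of the paper.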


#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):
    return x * sigmoid(x)

n = 1000
x = np.linspace(-5, 3, n)
y = swish(x)

plt.title("Swish")
plt.plot(x, y, "r")
plt.show()