Can we measure beauty? Computational evaluation of coral reef aesthetics


Supplemental Information

Relative importance of features

Relative importance of all 109 features derived from a random forest approach. Features are grouped into three general feature groups, texture of the entire image (texture), color and brightness of the entire image (color), and size, color and brightness, and distribution of objects within the image (objects).

DOI: 10.7717/peerj.1390/supp-1

Feature examples

Example pictures of a healthy (left) and a degraded reef (right). The brightness-contrast measurement across the whole image (f28) yields 97 for (A) and 47 for (B). The green lines depict the central region of interest used for the 'Rule of Thirds' features. The orange line marks the focus region used for features f55 through f57, where an additional margin (μ = 0.1) has been included. (C) and (D) show the pictures after segmentation by K-means, using K = 2 and m = 1; from these images the number of connected components (feature f58) can be calculated (C = 1,470; D = 2,369). (E) and (G) show the Laplacian images produced for feature f30; (F) and (H) show the resized and normalized Laplacian images which serve as the basis for the calculation of f31. The blue bounding boxes contain 81% (E = 0.623, G = 0.611) and 96.04% (F = 0.089, H = 0.079) of the edge energy, respectively. (I) and (J) show the images after a three-level wavelet transform performed on the saturation channel IS. (K) gives an overview of the color models against which the analyzed images, or objects within images, are compared. The bar chart shows the average NCEAS score of the sites where pictures matching the respective color model were taken. The red boxes indicate the models that best fit images (A) and (B), respectively.

DOI: 10.7717/peerj.1390/supp-2

Raw data MATLAB code for feature extraction

MATLAB script to extract the 109 aesthetic features.

DOI: 10.7717/peerj.1390/supp-3

Overview of implemented features

Overview of all 109 implemented features, given along with their relative importance for the combined coral reef aesthetic value, a short description, and the study the respective feature was derived from (1, Datta et al., 2006; 2, Li & Chen, 2009; 3, Ke, Tang & Jing, 2006; 4, this study).

DOI: 10.7717/peerj.1390/supp-4

Overview of sampling locations

Overview of sampling locations and their coordinates, along with the respective NCEAS scores, the number of photographic images used from each location, and the calculated coral reef aesthetic values. The Tukey connecting-letters report indicates sites with significant differences in their coral reef aesthetic values.

DOI: 10.7717/peerj.1390/supp-5

Additional Information and Declarations

Competing Interests

The authors declare there are no competing interests.

Author Contributions

Andreas F. Haas conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Marine Guibert, Sandi Calhoun and Emma George performed the experiments, analyzed the data, prepared figures and/or tables, reviewed drafts of the paper.

Anja Foerschner conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, reviewed drafts of the paper.

Tim Co performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, reviewed drafts of the paper.

Mark Hatay and Phillip Dustan performed the experiments, contributed reagents/materials/analysis tools, prepared figures and/or tables, reviewed drafts of the paper.

Elizabeth Dinsdale conceived and designed the experiments, performed the experiments, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Stuart A. Sandin and Jennifer E. Smith performed the experiments, contributed reagents/materials/analysis tools, reviewed drafts of the paper.

Mark J.A. Vermeij performed the experiments, contributed reagents/materials/analysis tools, wrote the paper, reviewed drafts of the paper.

Ben Felts performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, reviewed drafts of the paper.

Peter Salamon conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Forest Rohwer conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, reviewed drafts of the paper.

Funding

The work was funded by the Gordon and Betty Moore Foundation, Investigator Award 3781 to FR. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Acknowledgements

 

We thank the Biosphere Foundation for providing the pictures of Carysfort Reef. We further thank the captain, Martin Graser, and the crew of the M/Y Hanse Explorer.

Appendix: Feature extraction

 

Global features

Global features are computed over all the pixels of an entire image.

Color: The HSL (hue, saturation, lightness) and HSV (hue, saturation, value) color spaces are the two most common cylindrical-coordinate representations of points in an RGB color model. They define the color of a pixel by its hue, saturation, and value or lightness, respectively (Joblove & Greenberg, 1978), which approximates human visual perception of color. The first step of each picture analysis was therefore to calculate the average hue, saturation, and value or lightness for both color spaces. For a constant hue, the definitions of saturation and of value versus lightness differ considerably between the two spaces. Hue, saturation, and value of a pixel in the HSV space are therefore denoted as $I_H(m,n)$, $I_S(m,n)$ and $I_V(m,n)$, and hue, saturation and lightness in the HSL space as $I_{\bar H}(m,n)$, $I_{\bar S}(m,n)$ and $I_{\bar L}(m,n)$ from here on, where $M$ and $N$ are the number of rows and columns of each image and $(m,n)$ indexes a pixel.
$$f_1=\frac{1}{MN}\sum_n\sum_m I_H(m,n) \qquad f_2=\frac{1}{MN}\sum_n\sum_m I_S(m,n) \qquad f_3=\frac{1}{MN}\sum_n\sum_m I_V(m,n)$$
$$f_4=\frac{1}{MN}\sum_n\sum_m I_{\bar S}(m,n) \qquad f_5=\frac{1}{MN}\sum_n\sum_m I_{\bar L}(m,n).$$
To assess colorfulness, the RGB color space was divided into 64 cubes of identical volume by splitting each axis into four equal parts. Each cube was then treated as an individual sample point, and the color distribution $D_1$ of an image was defined as the frequency of color occurrence within each of the 64 cubes. Additionally, a reference distribution $D_0$ was generated in which each sample point has a frequency of 1/64. The colorfulness of an image was then defined as the distance between these two distributions, using the quadratic-form distance (Ke, Tang & Jing, 2006) and the Earth Mover's Distance (EMD). Both features take the pair-wise Euclidean distances between the sample points into account. With $c_i$ the center position of the $i$-th cube, we get $d_{ij} = \|\mathrm{rgb2luv}(c_i) - \mathrm{rgb2luv}(c_j)\|_2$ after a conversion to the LUV (Adams chromatic valence space; Adams, 1943) color space. This leads to
$$f_6=(h-h_0)^{T} A\,(h-h_0) \qquad \text{and} \qquad f_7=\mathrm{emd}\left(D_1, D_0, \{d_{ij}\,|\,1\le i,j\le 64\}\right)$$
in which $h$ and $h_0$ are vectors listing the frequencies of color occurrence in $D_1$ and $D_0$. $A = (a_{ij})$ is a similarity matrix with $a_{ij} = 1 - d_{ij}/d_{\max}$ and $d_{\max} = \max(d_{ij})$; 'emd' denotes the Earth Mover's Distance, which we implemented using the algorithm described by Rubner, Tomasi & Guibas (2000).
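As a minimal illustration, the averages f1–f5 could be computed in MATLAB roughly as follows; the image file name is hypothetical, and because MATLAB has no built-in HSL conversion, HSL lightness and saturation are derived here directly from the RGB channels.

```matlab
% Sketch of features f1-f5 (average HSV and HSL components).
img    = im2double(imread('reef.jpg'));            % hypothetical image file
hsvImg = rgb2hsv(img);                             % built-in RGB -> HSV conversion
f1 = mean2(hsvImg(:,:,1));                         % average hue         (f1)
f2 = mean2(hsvImg(:,:,2));                         % average saturation  (f2)
f3 = mean2(hsvImg(:,:,3));                         % average value       (f3)
mx = max(img, [], 3);                              % per-pixel channel maximum
mn = min(img, [], 3);                              % per-pixel channel minimum
L  = (mx + mn) / 2;                                % HSL lightness
S  = (mx - mn) ./ max(1 - abs(2*L - 1), eps);      % HSL saturation (guard against 0/0)
f4 = mean2(S);                                     % average HSL saturation (f4)
f5 = mean2(L);                                     % average HSL lightness  (f5)
```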

For the color analysis only pixels with a saturation $I_{\bar S}(m,n) > 0.2$ and a lightness $I_{\bar L}(m,n) \in [0.15, 0.95]$ were used, as the human eye is unable to distinguish hues outside this range and only sees shades of grey. With $P_H = \{(m',n')\,|\,I_{\bar S} > 0.2 \text{ and } 0.15 < I_{\bar L} < 0.95\}$ representing the set of pixels whose hues can be perceived by humans, $f_8$ was defined as the most frequent hue in each image and $f_9$ as the standard deviation of colorfulness.
$$f_8=\min(h_{\max}),$$
where, for every hue $h$, $\#\{(m',n') \in P_H\,|\,I_{\bar H} = h_{\max}\} \ge \#\{(m',n') \in P_H\,|\,I_{\bar H} = h\}$. If several hues had an identical cardinality, the smallest one was chosen.
$$f_9=\mathrm{std}\!\left(\mathrm{var}(I'_{\bar H})\right),$$
where $I'_{\bar H}(m,n)=I_{\bar H}(m,n)$ if $(m,n) \in P_H$ and $I'_{\bar H}(m,n)=0$ otherwise. $\mathrm{var}(I'_{\bar H})$ is the vector containing the variance of each column of $I'_{\bar H}$, and std returns its standard deviation.

The hue interval [0, 360] was then uniformly divided into 20 bins of identical size and computed into a hue histogram $h$ of the image. $Q$ represents the maximum value of this histogram, and the hue count was defined as the number of bins containing values greater than $CQ$. The number of missing hues represents bins with values smaller than $cQ$. $C$ and $c$ were set to 0.1 and 0.01, respectively.
$$f_{10}=\#\{i\,|\,h(i)>CQ\} \qquad f_{11}=\#\{i\,|\,h(i)<cQ\}.$$
Hue contrast and missing-hue contrast were computed as
$$f_{12}=\max\|c_h(i)-c_h(j)\|_{al} \text{ with } i,j \in \{i\,|\,h(i)>CQ\} \qquad f_{13}=\max\|c_h(i)-c_h(j)\|_{al} \text{ with } i,j \in \{i\,|\,h(i)<cQ\}$$
where $c_h(i)$ is the center hue of the $i$-th bin of the histogram and $\|\cdot\|_{al}$ refers to the arc-length distance on the hue wheel. $f_{14}$ denotes the percentage of pixels belonging to the most frequent hue:
$$f_{14}=Q/N \quad \text{where } N=\#\text{ of } P_H \qquad f_{15}=20-\#\{i\,|\,h(i)>C_2 Q\} \quad \text{with } C_2=0.05.$$
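A possible MATLAB sketch of the hue-count features f10 and f11 is given below; `hues` is an assumed vector holding the hues (in degrees) of the pixels in P_H, and the constants follow the text.

```matlab
% Sketch of the hue-count features f10 and f11 over the perceivable pixels.
h   = histcounts(hues, linspace(0, 360, 21));   % 20 hue bins of identical size
Q   = max(h);                                   % maximum value of the histogram
f10 = sum(h > 0.10 * Q);                        % bins above C*Q  (C = 0.1)
f11 = sum(h < 0.01 * Q);                        % bins below c*Q  (c = 0.01)
```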

Color models: As some color combinations are more pleasant to the human eye than others (Li & Chen, 2009), each image was fit against one of 9 color models (Fig. S2K). As the models can rotate, $M_k(\alpha)$ denotes the $k$-th model rotated by an angle $\alpha$, and $G_k(\alpha)$ the grey part (the admissible hue sectors) of the respective model. $E_{M_k(\alpha)}(m,n)$ was defined as the hue of $G_k(\alpha)$ closest to $I_{\bar H}(m,n)$:
$$E_{M_k(\alpha)}(m,n)=\begin{cases} I_{\bar H}(m,n) & \text{if } I_{\bar H}(m,n) \in G_k(\alpha)\\ H_{\text{nearest border}} & \text{if } I_{\bar H}(m,n) \notin G_k(\alpha)\end{cases}$$
where $H_{\text{nearest border}}$ is the hue of the sector border in $M_k(\alpha)$ closest to the hue of pixel $(m,n)$. The distance between the image and the model $M_k(\alpha)$ can then be computed as
$$F_{k,\alpha}=\frac{1}{\sum_n\sum_m I_{\bar S}(m,n)}\sum_n\sum_m \left\|E_{M_k(\alpha)}(m,n)-I_{\bar H}(m,n)\right\|_{al}\, I_{\bar S}(m,n)$$
with $I_{\bar S}(m,n)$ accounting for smaller color differences at lower saturation. This definition of the distance to a model was inspired by Datta et al. (2006), with the addition of the normalization $1/\sum_n\sum_m I_{\bar S}(m,n)$, which allows a comparison of images of different sizes. As the distances of an image to each model yield more information than the identity of the single model the image fits best, all distances were calculated, and features $f_{16}$–$f_{24}$ are therefore defined as the smallest distance to each model:
$$f_{15+k}=\min_\alpha F_{k,\alpha}, \quad k \in \{1,\ldots,9\}.$$
Theoretically the best-fitting hue model could be defined as $M_{k_0}(\alpha_0)$ with
$$\alpha(k)=\arg\min_\alpha F_{k,\alpha}, \qquad k_0=\arg\min_{k\in\{1,\ldots,9\}} F_{k,\alpha(k)} \qquad \text{and} \qquad \alpha_0=\alpha(k_0).$$
Those models are, however, very difficult to fit. Therefore we set a threshold $TH$, assuming that if $F_{k,\alpha(k)} < TH$ the picture fits the $k$-th color model. If $\forall k\; F_{k,\alpha(k)} \ge TH$, the picture was fit to the closest model. In case several models could be assigned to an image, not the closest one but the most restrictive was chosen. As the color models are already ordered according to their restrictiveness, we characterize the fit to the color models as
$$f_{25}=\begin{cases}\max\{j\,|\,F_{j,\alpha(j)} < TH\} & \text{if } \exists k \in \{1,\ldots,9\}: F_{k,\alpha(k)} < TH\\ k_0 & \text{if } \forall k\; F_{k,\alpha(k)} \ge TH.\end{cases}$$
Normalizing the distances to the models enabled us to set a unique threshold ($TH = 10$) for all images independently of their size.
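The arc-length distance on the hue wheel, used both here and for the contrast features above, can be written as a one-line helper; hues are assumed to be given in degrees.

```matlab
% Arc-length (circular) distance between two hues given in degrees.
arclen = @(h1, h2) min(abs(h1 - h2), 360 - abs(h1 - h2));
arclen(350, 10)   % returns 20, not 340
```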

Brightness: The light conditions captured in a picture are among the most noticeable features involved in human aesthetic perception. Some information about the light conditions is already covered by the color analysis described above; analyzing the brightness, however, provides a more direct way to evaluate the light conditions of a given image. There are several ways to measure the brightness of an image. For this study, we implemented analyses that target slightly different aspects of brightness and brightness contrast.
$$f_{26}=\frac{1}{MN}\sum_m\sum_n L(m,n) \qquad f_{27}=\exp\!\left(\frac{255}{MN}\sum_m\sum_n \log^{+}\frac{L(m,n)}{255}\right)$$
where $L(m,n) = \left(I_r(m,n) + I_g(m,n) + I_b(m,n)\right)/3$. $f_{26}$ represents the arithmetic and $f_{27}$ the logarithmic average brightness; the latter takes the dynamic range of the brightness into account. Different images can therefore be equal in one value but differ in the other. The contrast of brightness was assessed by defining $h_1$ as a histogram with 100 equally sized bins for the brightness $L(m,n)$, with $d$ as the index of the bin with the maximum energy, $h_1(d) = \max(h_1)$. Two indices $a$ and $b$ were determined such that the interval $[a,b]$ contains 98% of the energy of $h_1$; the histogram was analyzed step by step towards both sides, starting from the $d$-th bin, to identify $a$ and $b$. The first measure of brightness contrast is then
$$f_{28}=b-a+1.$$
For the second contrast feature, a brightness histogram $h_2$ with 256 bins was built as the sum of the gray-level histograms $h_r$, $h_g$ and $h_b$ generated from the red, green and blue channels:
$$h_2(i)=h_r(i)+h_g(i)+h_b(i), \quad i\in\{0,\ldots,255\}.$$
The contrast quality $f_{29}$ is then the width of the smallest interval $[a_2, b_2]$ for which $\sum_{i=a_2}^{b_2} h_2(i) > 0.98\sum_{i=0}^{255} h_2(i)$:
$$f_{29}=b_2-a_2.$$
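A sketch of the brightness features f26 and f28 is shown below, assuming the uint8 channels Ir, Ig and Ib have already been split from the image; the greedy growth of the interval [a, b] is one possible reading of "step by step towards both sides".

```matlab
% Sketch of brightness features f26 and f28.
L      = (double(Ir) + double(Ig) + double(Ib)) / 3;   % per-pixel brightness
f26    = mean(L(:));                                   % arithmetic average brightness
h1     = histcounts(L(:), 100);                        % 100-bin brightness histogram
[~, d] = max(h1);                                      % bin with the maximum energy
a = d;  b = d;  total = sum(h1);
while sum(h1(a:b)) < 0.98 * total                      % grow [a,b] around bin d
    if a > 1 && (b == numel(h1) || h1(a-1) >= h1(b+1))
        a = a - 1;                                     % extend towards darker bins
    else
        b = b + 1;                                     % extend towards brighter bins
    end
end
f28 = b - a + 1;                                       % first brightness-contrast measure
```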

Edge features: Edge repartition was assessed by looking for the smallest bounding box that contains a chosen percentage of the edge energy and comparing its area to the area of the entire picture. Although Li & Chen (2009) and Ke, Tang & Jing (2006) offer two different versions of this feature, both use the absolute value of the output of a 3 × 3 Laplacian filter with α = 0.2. For color images the R, G and B channels are filtered separately and the mean of the absolute values is used. At the boundaries, values outside the bounds of the matrix were taken to be equal to the nearest value inside the matrix borders. Following Li & Chen (2009), the area of the smallest bounding box containing 81% of the edge energy of the 'Laplacian image' (90% in each direction) was divided by the area of the entire image (Figs. S2E–S2H):
$$f_{30}=\frac{H_{90} W_{90}}{HW}$$
where $H_{90}$ and $W_{90}$ are the height and width of the bounding box, and $H$ and $W$ the height and width of the image.
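A possible MATLAB sketch of f30 follows; taking the rows and columns whose cumulative edge energy lies between 5% and 95% is one way to realize "90% in each direction", and the RGB image img is assumed to be in double format.

```matlab
% Sketch of the Laplacian edge-energy image and the bounding-box feature f30.
lap = fspecial('laplacian', 0.2);                        % 3x3 Laplacian, alpha = 0.2
E   = ( abs(imfilter(img(:,:,1), lap, 'replicate')) ...
      + abs(imfilter(img(:,:,2), lap, 'replicate')) ...
      + abs(imfilter(img(:,:,3), lap, 'replicate')) ) / 3;
cr  = cumsum(sum(E, 2)) / sum(E(:));                     % cumulative row energy
cc  = cumsum(sum(E, 1)) / sum(E(:));                     % cumulative column energy
H90 = sum(cr >= 0.05 & cr <= 0.95) / numel(cr);          % fraction of rows in the central 90%
W90 = sum(cc >= 0.05 & cc <= 0.95) / numel(cc);          % fraction of columns in the central 90%
f30 = H90 * W90;                                         % bounding-box area / image area
```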

Ke, Tang & Jing (2006) instead resized each Laplacian image to 100 × 100 and normalized the image sum to 1. Subsequently, the area of the bounding box containing 96.04% of the edge energy (98% in each direction) was determined and the quality of the image defined as $1 - H_{98} W_{98}$, where $H_{98}$ and $W_{98}$ are the height and width of the bounding box:
$$f_{31}=1-H_{98} W_{98}; \qquad H_{98}, W_{98} \in [0,1].$$
Resizing and normalizing the Laplacian images further allows an easy comparison between different Laplacian images. Analogous to Ke, Tang & Jing (2006), who compared one group of professional-quality photos with one group of photos of inferior quality, we can consider two groups of images: one with pictures of pristine and one with pictures of degraded reefs. $M_p$ and $M_s$ denote the mean Laplacian images of the pictures in the respective groups. This allows a comparison of a Laplacian image $L$ with $M_p$ and $M_s$ using the $L_1$-distance:
$$f_{32}=d_s-d_p, \quad \text{where} \quad d_s=\sum_{m,n}|L(m,n)-M_s(m,n)| \quad \text{and} \quad d_p=\sum_{m,n}|L(m,n)-M_p(m,n)|.$$
The sum of edges $f_{33}$ was added as an additional feature not implemented in any of the above-mentioned studies. The Sobel image $S$ of a picture was defined as a binary image of identical size, with 1's assigned to pixels where an edge is present according to the Sobel method and 0's where no edge is present. For a color image, Sobel images $S_r$, $S_g$ and $S_b$ were constructed for the red, green and blue channels and the sum of edges defined as
$$f_{33}=\left(\|S_r\|_{L_1}+\|S_g\|_{L_1}+\|S_b\|_{L_1}\right)/3.$$
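The sum-of-edges feature f33 maps directly onto MATLAB's edge detector; a sketch, assuming the double RGB image img from the snippet above:

```matlab
% Sketch of the sum-of-edges feature f33 from per-channel Sobel edge maps.
Sr  = edge(img(:,:,1), 'sobel');               % binary edge map, red channel
Sg  = edge(img(:,:,2), 'sobel');               % binary edge map, green channel
Sb  = edge(img(:,:,3), 'sobel');               % binary edge map, blue channel
f33 = (nnz(Sr) + nnz(Sg) + nnz(Sb)) / 3;       % average number of edge pixels
```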

Texture analysis: To analyze the texture of the pictures more thoroughly, we implemented features not discussed in Ke, Tang & Jing (2006), Datta et al. (2006), or Li & Chen (2009). $R_H$ is a matrix of the same size as $I_H$ in which each pixel $(m,n)$ contains the range (maximum value minus minimum value) of the 3-by-3 neighborhood surrounding the corresponding pixel in $I_H$. $R_S$ and $R_V$ were computed in the same way for $I_S$ and $I_V$, and the range of texture was defined as
$$f_{34}=\frac{1}{MN}\sum_m\sum_n \frac{R_H(m,n)+R_S(m,n)+R_V(m,n)}{3}.$$
Additionally, $D_H$, $D_S$, and $D_V$ were set as the matrices identical in size to $I_H$, $I_S$, and $I_V$, in which each pixel $(m,n)$ contains the standard deviation of the 3-by-3 neighborhood around the corresponding pixel in $I_H$, $I_S$, or $I_V$. The average standard deviation of texture was defined as
$$f_{35}=\frac{1}{MN}\sum_m\sum_n \frac{D_H(m,n)+D_S(m,n)+D_V(m,n)}{3}.$$
The entropy of an image is a statistical measure of its randomness and can also be used to characterize its texture. For a gray-level image it is defined as $-\sum_{i=0}^{255} p_i \log_2 p_i$, where $p$ is the 256-bin gray-level histogram of the image normalized to probabilities. Thus, features $f_{36}$, $f_{37}$ and $f_{38}$ are defined as the entropy of $I_r$, $I_g$, and $I_b$, respectively:
$$f_{36}=\mathrm{entropy}(I_r) \qquad f_{37}=\mathrm{entropy}(I_g) \qquad f_{38}=\mathrm{entropy}(I_b).$$
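The texture features f34–f38 correspond closely to the Image Processing Toolbox neighbourhood filters; a sketch, with hsvImg and img as defined in the earlier snippets:

```matlab
% Sketch of texture features f34-f38 (3x3 local range, local std, channel entropy).
R   = rangefilt(hsvImg(:,:,1)) + rangefilt(hsvImg(:,:,2)) + rangefilt(hsvImg(:,:,3));
f34 = mean2(R) / 3;                            % average local range over H, S and V
D   = stdfilt(hsvImg(:,:,1)) + stdfilt(hsvImg(:,:,2)) + stdfilt(hsvImg(:,:,3));
f35 = mean2(D) / 3;                            % average local standard deviation
f36 = entropy(img(:,:,1));                     % entropy of the red channel
f37 = entropy(img(:,:,2));                     % entropy of the green channel
f38 = entropy(img(:,:,3));                     % entropy of the blue channel
```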

Wavelet-based texture: Texture analysis based on wavelets was conducted according to Datta et al. (2006). However, concrete information on some of the implemented steps (e.g., the norm or the exact Daubechies wavelet used) was not always available, which may result in slight deviations of the calculation. First, a three-level wavelet transform was performed on $I_H$ using the Haar wavelet (see Figs. S2I and S2J). A 2D wavelet transform of an image yields four matrices: the approximation coefficient matrix $C^A$ and the three detail coefficient matrices $C^H$, $C^V$ and $C^D$. The height and width of the resulting matrices are 50% of those of the input image, and $C^H$, $C^V$ and $C^D$ contain the horizontal, vertical and diagonal details of the image. For a three-level wavelet transform, the 2D transform is performed once, repeated on the approximation coefficient matrix $C_1^A$, and repeated again on the new approximation coefficient matrix $C_2^A$, resulting in three sets of coefficient matrices. The $i$-th-level detail coefficient matrices of the hue image $I_H$ are denoted $C_i^H$, $C_i^V$ and $C_i^D$, $i\in\{1,2,3\}$. Features $f_{39}$–$f_{41}$ are then defined as
$$f_{38+i}=\frac{1}{S_i}\sum_m\sum_n \left(C_i^H(m,n)+C_i^V(m,n)+C_i^D(m,n)\right), \quad i\in\{1,2,3\}$$
where $\forall i \in \{1,2,3\}$, $S_i=\|C_i^H\|_{L_1}+\|C_i^V\|_{L_1}+\|C_i^D\|_{L_1}$. Features $f_{42}$–$f_{44}$ and $f_{45}$–$f_{47}$ are computed accordingly for $I_S$ and $I_V$. Features $f_{48}$–$f_{50}$ are defined as the sums of the three wavelet features for H, S, and V, respectively:
$$f_{48}=\sum_{i=39}^{41} f_i, \qquad f_{49}=\sum_{i=42}^{44} f_i, \qquad f_{50}=\sum_{i=45}^{47} f_i.$$
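Because the exact norm is not specified in the text, the sketch below uses MATLAB's matrix 1-norm for S_i; IH is the hue channel, and only the first-level feature f39 is shown.

```matlab
% Sketch of the first-level wavelet texture feature f39 on the hue channel IH.
[cA1, cH1, cV1, cD1] = dwt2(IH, 'haar');            % level-1 2D Haar transform
S1  = norm(cH1, 1) + norm(cV1, 1) + norm(cD1, 1);   % assumed norm (ambiguous in the text)
f39 = sum(cH1(:) + cV1(:) + cD1(:)) / S1;           % summed detail coefficients over S1
% Levels 2 and 3 (f40, f41) repeat dwt2 on the approximation matrices cA1 and cA2.
```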

Blur: Measurements of image blur were based on suggestions by Li & Chen (2009) and Ke, Tang & Jing (2006). With the information provided we were not able to implement the features exactly, thus the features presented here are a modified adaptation. Each picture was considered to be a blurred image $I_{\text{blurred}}$ resulting from the convolution of a hypothetical sharp version of the image $I_{\text{sharp}}$ with a Gaussian filter $G_\sigma$: $I_{\text{blurred}} = G_\sigma * I_{\text{sharp}}$. As the Gaussian filter eliminates only high frequencies, the blur of a picture can be determined by quantifying the frequency content of the image above a certain threshold $\theta$; higher frequencies indicate less blur. The threshold $\theta$ reduces noise and provides a defined cutoff for the high frequencies. To quantify blur in a given image, a 2D Fourier transform was performed, resulting in $Y$. To avoid ambiguities, the 2D Fourier transform was normalized by $1/(MN)$: $Y=\mathrm{fft2}(I_{\text{blurred}})/(MN)$. As we observed spatial aliasing, only the frequencies $(m',n')$ with $0 < m' < M/2$ and $0 < n' < N/2$ were used, resulting in
$$f_{51}=\max\left\{\frac{2m'}{M},\; \frac{2n'}{N}\right\} \quad \text{where } |Y(m',n')| > \theta,\; 0 < m' < M/2,\; 0 < n' < N/2.$$
The threshold was set to $\theta = 0.45$.
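A sketch of the blur feature f51, assuming a grayscale image Igray in double format; MATLAB indices start at 1, so frequency indices are shifted by one before applying the constraints.

```matlab
% Sketch of the blur feature f51 (highest normalized frequency above theta).
[M, N]   = size(Igray);
Y        = abs(fft2(Igray)) / (M * N);          % normalized 2D Fourier transform
theta    = 0.45;
[mi, ni] = find(Y > theta);                     % indices of frequencies above the threshold
fm = mi - 1;  fn = ni - 1;                      % convert to zero-based frequencies
keep = fm > 0 & fm < M/2 & fn > 0 & fn < N/2;   % discard DC and aliased frequencies
if any(keep)
    f51 = max([2*fm(keep)/M; 2*fn(keep)/N]);    % highest normalized frequency present
else
    f51 = 0;                                    % nothing above the threshold
end
```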

Local features

In addition to global features, which provide information about the general aspect of a picture, local features consider fragments of the image. This approach focuses on the objects captured in the photograph while disregarding the overall composition, which is partly dependent on the camera operator. Objects corresponding to uniform regions can be detected with the segmentation process described in Datta et al. (2006). First the image is transformed into the LUV color space and the K-means algorithm is used to create K color-based pixel clusters. Then a connected-components analysis in an 8-connected neighborhood is performed to generate a list of all segments present. The 5 largest segments are denoted $s_1, \ldots, s_5$, in decreasing order of size. As most pictures contain many details resulting in noise, we applied a uniform blur with an m × m ones matrix as kernel before the segmentation process.
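A sketch of the segmentation step is given below; since MATLAB has no built-in rgb2luv, the Lab space (rgb2lab) is used as a stand-in, and K = 2 with m = 1 (i.e., a no-op blur) follows the example in Fig. S2.

```matlab
% Sketch of the segmentation: uniform blur, K-means clustering, connected components.
K = 2;  m = 1;                                          % values used in Fig. S2
blurred = imfilter(img, ones(m) / m^2, 'replicate');    % m-by-m uniform blur (m = 1: no-op)
lab     = rgb2lab(blurred);                             % Lab as a stand-in for LUV
pix     = reshape(lab, [], 3);                          % one row per pixel
idx     = kmeans(pix, K, 'MaxIter', 200, 'Replicates', 3);
Lmap    = reshape(idx, size(img, 1), size(img, 2));     % cluster label per pixel
sizes   = [];
for k = 1:K                                             % 8-connected components per cluster
    cc    = bwconncomp(Lmap == k, 8);
    sizes = [sizes, cellfun(@numel, cc.PixelIdxList)];  %#ok<AGROW> segment sizes (pixels)
end
f58 = numel(sizes);                                     % number of connected segments
```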

Rule of thirds: A well-known paradigm in photography is that the main subject of a picture should generally lie in its central area. This rule is called the 'rule of thirds', and the 'central area' can be defined more precisely as the central ninth of a photo, bounded by lines at 1/3 and 2/3 of its height and width (see Figs. S2A and S2B). Using the HSV color space, $f_{52}$ is the average hue of this region:
$$f_{52}=\frac{1}{\left(\frac{2M}{3}-\frac{M}{3}+1\right)\left(\frac{2N}{3}-\frac{N}{3}+1\right)}\sum_{m=M/3}^{2M/3}\sum_{n=N/3}^{2N/3} I_H(m,n).$$
$I_S$ and $I_V$ are treated accordingly in $f_{53}$ and $f_{54}$.
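The central-ninth averages f52–f54 reduce to simple index arithmetic in MATLAB; a sketch with the hsvImg image from above:

```matlab
% Sketch of the Rule-of-Thirds features f52-f54 (mean H, S, V of the central ninth).
[M, N, ~] = size(hsvImg);
rows = ceil(M/3) : floor(2*M/3);               % central third of the rows
cols = ceil(N/3) : floor(2*N/3);               % central third of the columns
f52  = mean2(hsvImg(rows, cols, 1));           % average hue of the central ninth
f53  = mean2(hsvImg(rows, cols, 2));           % average saturation
f54  = mean2(hsvImg(rows, cols, 3));           % average value
```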

Focus region: Li & Chen (2009) offer a slightly different approach to the rule of thirds. They suggest using the HSL color space and argue that focusing exclusively on the central ninth is too restrictive. Following this approach, the focus region FR was defined as the central ninth of the picture plus a defined percentage μ of its immediate surroundings (Figs. S2A and S2B). For the image analysis presented here we set μ = 0.1.
$$f_{55}=\frac{1}{\#\{(m,n)\,|\,(m,n)\in FR\}}\sum_{(m,n)\in FR} I_{\bar H}(m,n).$$
$I_{\bar S}$ and $I_{\bar L}$ are treated accordingly in $f_{56}$ and $f_{57}$.

Segmentation: The segmentation process generates a list $L$ of connected segments, of which the 5 largest are denoted $s_1, \ldots, s_5$. Our analysis focuses only on the largest 3 or 5 segments. Not only the properties of each of these segments, but also the number of connected segments in each picture was recorded; the latter provides a proxy for the number of objects and the complexity of each recorded image:
$$f_{58}=\#\,L.$$
The number of segments $s_i$ in $L$ above a certain size threshold ($f_{59}$) and the relative sizes of the 5 largest segments $s_i$ ($f_{60}$–$f_{64}$) were defined as
$$f_{59}=\#\{s_i\,|\,\#\,s_i > MN/100,\ i\in\{1,\ldots,5\}\} \qquad f_{59+i}=\frac{\#\,s_i}{MN}, \quad i\in\{1,\ldots,5\}.$$
To gain information on the position of these 5 largest segments, the image was divided into 9 equal parts, identical to the rule-of-thirds feature analysis. With $(r_i, c_i) \in \{1,2,3\}^2$ the indices of the row and column containing the centroid of $s_i$, features $f_{65}$ through $f_{69}$ were defined, counting from the top left of each image, as
$$f_{64+i}=10 r_i + c_i, \quad i\in\{1,\ldots,5\}.$$
The average hue, saturation and value were then assessed for each of the objects. Features $f_{70}$ through $f_{74}$ were computed as the average hues of the segments $s_i$ in the HSV color space:
$$f_{69+i}=\frac{1}{\#\,s_i}\sum_{(m,n)\in s_i} I_H(m,n), \quad i\in\{1,\ldots,5\}.$$
Features $f_{75}$–$f_{79}$ and $f_{80}$–$f_{84}$ are computed analogously for $I_S$ and $I_V$, respectively. Features $f_{85}$–$f_{87}$ were further defined as the average brightness of the top 3 segments:
$$f_{84+i}=\frac{1}{\#\,s_i}\sum_{(m,n)\in s_i} L(m,n), \quad i\in\{1,2,3\}$$
where the brightness $L(m,n)$ has already been defined under 'Brightness'. This allows us to compare the colors of the segments and to evaluate their diversity by measuring the average color spread $f_{88}$ of their hues. As complementary colors are aesthetically more pleasing together, $f_{89}$ was defined as a measure of complementary colors among the assessed segments:
$$f_{88}=\sum_{i=1}^{5}\sum_{j=1}^{5}|h(i)-h(j)| \qquad \text{and} \qquad f_{89}=\sum_{i=1}^{5}\sum_{j=1}^{5}\|h(i)-h(j)\|_{al}$$
where $\forall i \in \{1,\ldots,5\}$, $h(i) = f_{69+i}$ is the average hue of $s_i$.

As round, regular and convex shapes are generally considered more beautiful, the presence of such shapes in a picture should increase its aesthetic value. Here we only assessed the shapes of the 3 largest segments in each image. The coordinates of the centers of mass (first-order moments), the variance (second-order centered moment) and the skewness (third-order centered moment) were calculated for each of these segments by defining, for all $i \in \{1,2,3\}$,
$$f_{89+i}=\bar{x}_i=\frac{1}{\#\,s_i}\sum_{(m,n)\in s_i} x(m,n)$$
$$f_{92+i}=\bar{y}_i=\frac{1}{\#\,s_i}\sum_{(m,n)\in s_i} y(m,n)$$
$$f_{95+i}=\frac{1}{\#\,s_i}\sum_{(m,n)\in s_i}\left[\left(x(m,n)-\bar{x}_i\right)^2+\left(y(m,n)-\bar{y}_i\right)^2\right]$$
$$f_{98+i}=\frac{1}{\#\,s_i}\sum_{(m,n)\in s_i}\left[\left(x(m,n)-\bar{x}_i\right)^3+\left(y(m,n)-\bar{y}_i\right)^3\right]$$
where, for all $(m,n)$, $(x(m,n), y(m,n))$ are the normalized coordinates of pixel $(m,n)$.

Horizontal and vertical coordinates were normalized by the height and width of the image to account for different image ratios. To quantify convex shapes in an image, $f_{102}$ was defined as the percentage of the image area covered by convex shapes. To reduce noise, only the $R$ segments $p_1, \ldots, p_R$ containing more than $MN/200$ pixels were incorporated in this feature. The convex hull $g_k$ was then computed for each $p_k$. Requiring a perfectly convex shape ($p_k \cap g_k = p_k$ and $\frac{\mathrm{area}(p_k)}{\mathrm{area}(g_k)}=1$) would be too restrictive for the analysis of natural objects, so $p_k$ was considered convex if $\frac{\mathrm{area}(p_k)}{\mathrm{area}(g_k)}>\delta$:
$$f_{102}=\frac{1}{MN}\sum_{k=1}^{R} I\!\left(\frac{\mathrm{area}(p_k)}{\mathrm{area}(g_k)}>\delta\right)\left|\mathrm{area}(p_k)\right|$$
where $I(\cdot)$ is the indicator function and $\delta = 0.8$.
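With a label matrix in which each connected segment carries its own integer label (e.g., labelmatrix applied to the bwconncomp output above), regionprops delivers the quantities needed for f102; a sketch, where segLabels is that assumed label matrix:

```matlab
% Sketch of the convex-shape coverage feature f102 (delta = 0.8).
stats  = regionprops(segLabels, 'Area', 'ConvexArea'); % segLabels: assumed label matrix
areas  = [stats.Area];
cvx    = [stats.ConvexArea];
big    = areas > numel(segLabels) / 200;               % keep segments above M*N/200 pixels
convex = big & (areas ./ cvx > 0.8);                   % convexity criterion area/hull > delta
f102   = sum(areas(convex)) / numel(segLabels);        % image fraction covered by convex shapes
```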

The last segmentation-based features measure different types of contrast between the 5 largest segments. Features $f_{103}$–$f_{106}$ address the hue contrast, the saturation contrast, the brightness contrast and the blur contrast. First the average hue, saturation, brightness and blur of each $s_i$ were calculated:
$$h(i)=\frac{1}{\#\,s_i}\sum_{(m,n)\in s_i} I_H(m,n), \quad i\in\{1,\ldots,5\}$$
$$s(i)=\frac{1}{\#\,s_i}\sum_{(m,n)\in s_i} I_S(m,n), \quad i\in\{1,\ldots,5\}$$
$$l(i)=\frac{1}{\#\,s_i}\sum_{(m,n)\in s_i} L(m,n), \quad i\in\{1,\ldots,5\}.$$
To calculate the blur of segment $s_i$, the image $I_{s_i}$ was computed as
$$I_{s_i}(m,n)=\begin{cases}\left(I_r(m,n)+I_g(m,n)+I_b(m,n)\right)/3 & \text{if } (m,n)\in s_i\\ 0 & \text{otherwise}\end{cases}$$
and $b(i)$ was defined as the blur measure of $I_{s_i}$ for all $i \in \{1,\ldots,5\}$, analogous to the previously described blur measure:
$$b(i)=\max\left\{\frac{2m'}{M},\; \frac{2n'}{N}\right\} \quad \text{where } |Y_i(m',n')| > \theta,\; 0 < m' < M/2,\; 0 < n' < N/2,$$
with $Y_i=\mathrm{fft2}(I_{s_i})/(MN)$ and $\theta = 0.45$. Features $f_{103}$–$f_{106}$ were then defined as
$$f_{103}=\max_{i,j\in\{1,\ldots,5\}}\|h(i)-h(j)\|_{al} \qquad f_{104}=\max_{i,j\in\{1,\ldots,5\}}|s(i)-s(j)|$$
$$f_{105}=\max_{i,j\in\{1,\ldots,5\}}|l(i)-l(j)| \qquad f_{106}=\max_{i,j\in\{1,\ldots,5\}}|b(i)-b(j)|.$$

Low depth-of-field indicators: Finally, following the method described by Datta et al. (2006) to detect low depth of field (DOF) and macro images, we divided the image into 16 rectangular blocks of identical size, $M_1, \ldots, M_{16}$, numbered in row-major order. Using the notation of the 'Wavelet-based texture' section, $C_3^H$, $C_3^V$ and $C_3^D$ denote the third-level detail coefficient matrices obtained by performing a three-level Haar wavelet transform on the hue channel of the image. The low-DOF indicator for the hue is then computed as
$$f_{107}=\frac{\sum_{(m,n)\in M_6\cup M_7\cup M_{10}\cup M_{11}} \left(C_3^H(m,n)+C_3^V(m,n)+C_3^D(m,n)\right)}{\sum_{i=1}^{16}\sum_{(m,n)\in M_i} \left(C_3^H(m,n)+C_3^V(m,n)+C_3^D(m,n)\right)}$$
and $f_{108}$ and $f_{109}$ are calculated analogously for saturation and value.
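A sketch of the low-DOF indicator f107 on the hue channel IH; the central rows and columns of the third-level coefficient matrices stand in for blocks M6, M7, M10 and M11, and absolute coefficient values are assumed.

```matlab
% Sketch of the low depth-of-field indicator f107 on the hue channel IH.
[C, S]       = wavedec2(IH, 3, 'haar');          % three-level Haar decomposition
[H3, V3, D3] = detcoef2('all', C, S, 3);         % third-level detail coefficients
E  = abs(H3) + abs(V3) + abs(D3);                % per-coefficient edge energy (abs assumed)
[r, c] = size(E);
ri = floor(r/4)+1 : floor(3*r/4);                % rows of the four central blocks
ci = floor(c/4)+1 : floor(3*c/4);                % columns of the four central blocks
f107 = sum(sum(E(ri, ci))) / sum(E(:));          % central energy / total energy
```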

Machine learning

To reduce noise and decrease error, we analyzed multiple methods of determining feature importance. An unsupervised random forest approach was used to identify the most important features (Fig. S1). For every tree in the construction of a random forest, the out-of-bag sample was sent down the tree and the number of correct predictions recorded. The variable importance was then generated by comparing the number of correct predictions from the out-of-bag sample to those from a randomly permuted variant. For each feature the resulting importance is
$$\frac{1}{n_{\text{trees}}}\sum_{\text{all trees}}\left(R_{OOB}-R_{\text{perm}}\right).$$
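This permutation importance can be reproduced with MATLAB's TreeBagger; a sketch, where X (the n-by-109 feature matrix) and y (the NCEAS scores) are assumed variable names, and the property name may differ between MATLAB releases ('OOBPermutedVarDeltaError' in older versions).

```matlab
% Sketch of permutation-based feature importance with a random forest.
rng(1);                                              % reproducibility
forest = TreeBagger(500, X, y, ...
                    'Method', 'regression', ...
                    'OOBPredictorImportance', 'on'); % enable out-of-bag permutation test
imp = forest.OOBPermutedPredictorDeltaError;         % importance per feature
[~, rank] = sort(imp, 'descend');                    % rank the 109 features (Fig. S1)
```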

A second method was to identify redundant columns before training. Using a covariance matrix of the 109 features, relationships between columns were analyzed, and columns with a correlation greater than 0.90 were clustered into groups; within every group, features were either directly or mutually correlated. In order not to compromise the comprehensive approach of the coral reef aesthetic feature analysis, the most important features from each group remained in the analysis, while highly correlated, less important features within a group were removed. We built neural networks based on both methods and found that removing redundant features yielded lower mean squared errors. Thus, we used a total of 97 features when building our ensemble of neural networks.
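A sketch of this redundancy screen: walk through the features from most to least important and drop any remaining feature correlated above 0.90 with one already kept; X and imp are the assumed feature matrix and importance vector from the previous snippet.

```matlab
% Sketch of correlation-based feature reduction (threshold 0.90).
C    = corrcoef(X);                            % 109-by-109 correlation matrix
keep = true(1, size(X, 2));
[~, order] = sort(imp, 'descend');             % visit features by decreasing importance
for i = order(:)'
    if keep(i)
        redundant              = abs(C(i, :)) > 0.90;  % strongly correlated features
        redundant(i)           = false;                % never drop the feature itself
        keep(redundant & keep) = false;                % drop less important correlated ones
    end
end
Xreduced = X(:, keep);                         % 97 features remained in this study
```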

To fuse the predictive power of the aesthetic features, a Levenberg–Marquardt algorithm was applied simultaneously to every sample of the training set to minimize the mean squared error between the estimated output score and the NCEAS value. Typical mean squared errors were in the 90s. We therefore set a threshold of 60 for the mean squared error and searched the weight space of the neural network for 10 sets of weights with a mean squared error of less than 60 on the validation set. The predicted NCEAS scores of these 10 networks were then averaged for the ensemble prediction, which constitutes our aesthetic value.
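A sketch of the ensemble training with MATLAB's neural-network tools; the hidden-layer size, data split and variable names (Xreduced, y, Xtest) are assumptions, while trainlm (Levenberg–Marquardt) and the validation-MSE threshold of 60 follow the text.

```matlab
% Sketch of the neural-network ensemble trained with Levenberg-Marquardt.
nets = {};
while numel(nets) < 10                               % collect 10 acceptable weight sets
    net = feedforwardnet(10, 'trainlm');             % assumed hidden-layer size
    net.divideParam.trainRatio = 0.70;               % assumed data split
    net.divideParam.valRatio   = 0.15;
    net.divideParam.testRatio  = 0.15;
    [net, tr] = train(net, Xreduced', y');           % samples as columns
    if tr.best_vperf < 60                            % validation MSE threshold from the text
        nets{end+1} = net;                           %#ok<AGROW>
    end
end
preds = zeros(numel(nets), size(Xtest, 1));
for i = 1:numel(nets)
    preds(i, :) = nets{i}(Xtest');                   % per-network predicted NCEAS score
end
aesthetic = mean(preds, 1);                          % ensemble average = aesthetic value
```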

After running test data through the ensemble of neural networks, we further analyzed the accuracy of our system by testing multiple pictures at a time. To see how much more reliably the NCEAS score could be deduced using N pictures from the same site, we averaged the outputs of our ensemble of neural networks over all '20 choose N' (N = 1, 2, 3, 4, 5) combinations available from the test batch. Combining multiple pictures reduced the root mean squared error from 6.57 for N = 1 to 5.35 for N = 2, 4.88 for N = 3, and 4.46 for both N = 4 and N = 5.
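This multi-picture test can be reproduced with nchoosek; sitePred (the 20 ensemble outputs for a site) and siteScore (its NCEAS score) are assumed variables.

```matlab
% Sketch of the multi-picture accuracy test (all "20 choose N" combinations).
for N = 1:5
    combos = nchoosek(1:20, N);                      % every N-picture subset of the site
    est    = mean(sitePred(combos), 2);              % averaged prediction per subset
    rmse   = sqrt(mean((est - siteScore).^2));       % deviation from the site's NCEAS score
    fprintf('N = %d: RMSE = %.2f\n', N, rmse);
end
```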

References

 